DNA is the carrier molecule of genetic information and has to pass this
information on faithfully to the daughter cells. Any modification of the DNA
molecule between two cell generations results in a mutation. Besides
maintaining the integrity of the DNA molecule in order to preserve the genetic
message, the cell must ensure that gene expression reflects the information
encoded by the DNA and is delivered accurately in response to external and
internal stimuli. To ensure this accuracy, quality controls are present at
each step of gene expression, from DNA replication to the folding and the
posttranslational modification of the protein (Araki and Nagata, 2011; Isken
and Maquat, 2007; Kilchert and Vasiljeva, 2013; Liu et al., 2014;
Lykke-Andersen and Bennett, 2014; Popp and Maquat, 2013; Porrua and Libri,
2015; Schmid and Jensen, 2013; Walters and Parker, 2014; Zhai and Xiang,
2014). This chapter will focus on one of these steps: nonsense-mediated mRNA
decay (NMD), the mRNA quality control that occurs after pre-mRNA splicing and
before the bulk of translation, and on the mechanisms directly related to it.

Nonsense Mutation Correction in Human Diseases
Copyright © 2016 Elsevier Inc. All rights reserved.

1 PREMATURE TERMINATION CODON, NONSENSE 
MUTATION, AND CONSEQUENCES ON GENE EXPRESSION 

An open reading frame (ORF) starts with a translation initiation codon, which
is most often an AUG codon, and ends with a stop codon (UGA, UAG, or UAA).
When an additional stop codon is present inside an ORF, that is, downstream
of the initiation codon and upstream of the stop codon that ends the coding
sequence for the C-terminal part of the wild-type protein, it is called a
premature termination codon (PTC). A PTC can be introduced in an ORF as a
consequence of various events, such as a point mutation changing a coding
codon into a stop codon (we then speak of a “nonsense mutation”), or an
insertion or a deletion inducing a frameshift mutation that leads to the
appearance of an in-frame PTC (Fig. 1.1 for the events at the DNA level).
Insertions or deletions can occur at the DNA level by insertion or excision
of DNA (transposable elements, for example) or at the RNA level during
pre-mRNA splicing, after a mutation located in an intron or in an exon that
compromises the recognition of splice sites. Indeed, some mutations can induce
partial or total intron retention, or total or partial exon skipping
(Fig. 1.2). It is worth noting that all nonsense mutations generate PTCs, but
not all PTCs arise from nonsense mutations.
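
The definition of a PTC given above can be illustrated with a minimal Python
sketch. The short ORF used here is hypothetical and was chosen only so that a
nonsense mutation and a one-nucleotide deletion both bring a stop codon in
frame; the function name `first_stop_index` is likewise an assumption made for
the illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def first_stop_index(mrna, start=0):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return (i - start) // 3
    return None

# Hypothetical ORF: AUG GCA CAG UGG GAA UUA GGC UAA (normal stop = codon 7).
wild_type = "AUGGCACAGUGGGAAUUAGGCUAA"

# Nonsense mutation: C->U at position 6 turns CAG (Gln) into UAG (stop).
nonsense = wild_type[:6] + "U" + wild_type[7:]

# Frameshift: deleting one nucleotide (position 3) shifts the reading frame
# and brings a UAG, previously out of frame, into frame.
frameshift = wild_type[:3] + wild_type[4:]

normal_stop = first_stop_index(wild_type)
for name, seq in [("nonsense", nonsense), ("frameshift", frameshift)]:
    stop = first_stop_index(seq)
    is_ptc = stop is not None and stop < normal_stop
    print(name, "first stop at codon", stop, "PTC:", is_ptc)
```

Both mutants stop translation before the normal termination codon, but only
the first one is a nonsense mutation in the strict sense.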

Statistical analysis of the distribution of the three stop codons at the normal
translation termination position reveals that UGA is the most frequent,
accounting for 47% of normal termination codons, followed by UAA with 30% and
UAG with 23% (Atkinson and Martin, 1994) (Table 1.1). Interestingly, the
distribution of the three stop codons as PTCs is slightly different: the most
frequent is again UGA, found in 51% of nonsense mutations, followed by UAG
with 31% and UAA with 18%. The relative frequency of each nonsense codon can
be explained by the nature of the codons that can be mutated into a stop
codon. Indeed, the TAG stop codon mainly comes from the codons CAG (Gln) or
TGG (Trp), the TAA stop codon from the codons CAA (Gln) or GAA (Glu), and the
TGA stop codon from the codons CGA (Arg) or TGG (Trp). These origins can be
partially explained by the fact that the most frequent mutation is the C→T
transition (44%); it is induced by deamination of the cytidine, which converts
it into a uridine that is then converted into a thymidine (Fig. 1.3), since
the mutation occurs in the DNA molecule (Atkinson and Martin, 1994)
(Table 1.1).
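
To see why the C→T transition accounts for only part of these origins, one can
enumerate the codons that become a stop codon through this single change. The
Python sketch below is an illustration under stated assumptions: the helper
name `transition_sources` is hypothetical, and a C→T change on the
complementary strand is modeled as a G→A change on the coding strand.

```python
STOPS = {"TAA", "TAG", "TGA"}

def transition_sources(change_from, change_to):
    """Sense codons that become a stop codon through one given base change."""
    hits = {}
    for stop in sorted(STOPS):
        for pos in range(3):
            if stop[pos] != change_to:
                continue
            # Revert the change at this position to recover the source codon.
            source = stop[:pos] + change_from + stop[pos + 1:]
            if source not in STOPS:
                hits.setdefault(stop, []).append(source)
    return hits

coding_strand = transition_sources("C", "T")   # C->T read on the coding strand
other_strand = transition_sources("G", "A")    # C->T on the complementary strand

print(coding_strand)
print(other_strand)
```

The enumeration recovers CAG, CAA, CGA, and TGG as sources, but not GAA, whose
conversion to TAA requires a G→T transversion. This is consistent with the
statement that the transition explains the observed origins only partially.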

Figure 1.1 Molecular events at the DNA level leading to the introduction of a PTC in the
ORF of an mRNA. The nucleotide triplet sequence is indicated and shows how various
mutations can lead to the appearance of a PTC.

Figure 1.2 Examples of mutations with consequences on pre-mRNA splicing. At the top
of the figure, a point mutation affects a 5' splice site, promoting the retention of the
intron in the mRNA, leading to the introduction of a PTC in the ORF. In the lower
example, a point mutation affects a 3' splice site, which is then no longer recognized
by the splicing machinery. The consequence is the skipping of exon 2, inducing a
frameshift and the introduction of a PTC in the ORF.

Table 1.1 Comparison of the frequency of each stop codon as physiological codons
and as premature termination codons

Physiological STOP codon

Codon identity    Frequency (%)
UGA               47
UAG               23
UAA               30

Nonsense codon

Codon identity    Frequency (%)
UGA               51
UAG               31
UAA               18



Figure 1.3 Transition of a cytidine into a thymidine. The cytidine is first subject to a 
deamination to generate a uridine. Since the reaction is occurring in DNA, the uridine is 
then subject to a methylation, in order to convert the uridine into a thymidine. 

The statistical analysis of the mutated codons at the origin of a nonsense
mutation shows that none of the codons encoding alanine, asparagine, aspartic
acid, histidine, isoleucine, methionine, phenylalanine, proline, threonine, or
valine can be replaced by a stop codon after a single point mutation. UGA is
the only stop codon to have exclusivity in the replacement of some amino
acids: codons encoding arginine, cysteine, or glycine can only be replaced by
the UGA stop codon after a point mutation. In contrast, any of the three stop
codons can be found at the position of a leucine or a serine (Table 1.2).
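
These statements follow directly from the standard genetic code and can be
checked by enumeration. The sketch below is a minimal illustration, not taken
from the cited analysis: it builds the standard codon table in the DNA
alphabet, then lists, for each amino acid (one-letter code), the stop codons
that a single-nucleotide substitution can produce. All names in it are
assumptions made for the example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOPS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}

def stops_reachable(codon):
    """Stop codons obtainable from `codon` by one nucleotide substitution."""
    found = set()
    for pos in range(3):
        for base in BASES:
            mutant = codon[:pos] + base + codon[pos + 1:]
            if mutant != codon and mutant in STOPS:
                found.add(mutant)
    return found

reachable = {}  # amino acid -> set of stop codons reachable from its codons
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        reachable.setdefault(aa, set()).update(stops_reachable(codon))

never = sorted(aa for aa, stops in reachable.items() if not stops)
only_tga = sorted(aa for aa, stops in reachable.items() if stops == {"TGA"})
all_three = sorted(aa for aa, stops in reachable.items() if stops == STOPS)

print("never replaced by a stop:", never)
print("replaced by TGA only:", only_tga)
print("replaced by any stop:", all_three)
```

The three lists correspond to the ten amino acids that cannot give rise to a
nonsense codon, to arginine, cysteine, and glycine, and to leucine and serine,
respectively.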

The consequence of a PTC for gene expression is rarely the synthesis of a
truncated protein, as will be explained later. Instead, the gene is silenced
by the activation of an RNA surveillance mechanism called NMD, which
specifically recognizes and degrades mRNAs harboring a PTC.


Table 1.2 Identity and distribution of the codons leading to a stop codon after a point
mutation

Original   Associated    Per stop     All stop      Resulting
codon      amino acid    codon (%)    codons (%)    STOP codon

CAG        Gln           41.07        12.84         TAG
TGG        Trp           23.21         7.26
GAG        Glu           16.07         5.02
TAC        Tyr           10.7          3.35
AAG        Lys            5.35         1.67
TTG        Leu            1.78         0.55
TAT        Tyr            1.78         0.55
TCG        Ser            0            0
Total                   100

GAA        Glu           40            7.82         TAA
CAA        Gln           28.57         5.58
TAT        Tyr           14.28         2.79
TAC        Tyr            8.57         1.67
TCA        Ser            2.85         0.55
TTA        Leu            2.85         0.55
AAA        Lys            2.85         0.55
Total                   100

CGA        Arg           62.5         30.7          TGA
TGG        Trp           17.04         8.37
GGA        Gly            5.68         2.79
TGC        Cys            5.68         2.79
TCA        Ser            4.54         2.23
TGT        Cys            3.4          1.67
TTA        Leu            1.13         0.55
AGA        Arg            0            0
Total                   100

Total                                100

Due to the degradation of the PTC-containing mRNA, the corresponding
gene is not expressed at the protein level, leading to the silencing of the gene.
NMD obeys specific rules and prevents the synthesis of truncated proteins that
have no function, that may have harmful properties for the cell or,
unfortunately, that retain partial or full wild-type activity (Bhuvanagiri
et al., 2010; Karam et al., 2013; Kervestin and Jacobson, 2012; Popp and
Maquat, 2014; Rebbapragada and Lykke-Andersen, 2009; Reznik and
Lykke-Andersen, 2010; Schweingruber et al., 2013; Silva and Romao, 2009). NMD
decreases the level of PTC-containing mRNAs to 5-25% of the level of the
corresponding wild-type mRNA, meaning that a small proportion of
PTC-containing mRNAs escape NMD (Kuzmiak and Maquat, 2006). For the most part,
these escaping mRNAs are not translated, as demonstrated by the absence of
detectable protein synthesis from them, which ensures the silencing of
PTC-containing genes (You et al., 2007). However, there are exceptions: for
instance, truncated HSP110 or p53 proteins synthesized from PTC-containing
mRNAs have been reported (Dorard et al., 2011; Anczukow et al., 2008).

Various events can lead to the introduction of a PTC in a specific
mRNA. Some of them are rare, such as errors during DNA replication or
transcription that generate either a nonsense mutation or a frameshift
mutation by insertion or deletion. However, the main sources of PTCs are
splicing events and programmed DNA rearrangements occurring at specific loci,
such as the T-cell receptor or the immunoglobulin genes (Fig. 1.4) (Delpy
et al., 2004; Green et al., 2003; Wang et al., 2002).

Since pre-mRNA splicing events are the major source of PTCs, and because of
the strong links between pre-mRNA splicing and NMD (see Section 3.3), a
description of the splicing mechanism helps in understanding how PTCs are
identified during NMD.

2 PRE-mRNA SPLICING MECHANISM 
2.1 Generalities 

Pre-mRNA splicing is a general maturation process in higher eukaryotes: only
about 700 of the 20,000-25,000 human genes, that is, about 3%, are intronless
(Busch and Hertel, 2013; Lander et al., 2001; Louhichi et al., 2011). Histone
genes, interferon genes, and 50% of G-protein-coupled receptor genes
constitute the major examples of intronless genes (Louhichi et al., 2011;
Markovic and Challiss, 2009; Shabalina et al., 2010). Among the spliced
pre-mRNAs, about 95% are also subject to alternative splicing, which increases
the diversity of protein isoforms generated from one pre-mRNA and points to
the complex regulation that occurs during mRNA maturation in order to
synthesize the right protein isoform at the right moment and/or place (see
Section 2.3) (Pan et al., 2008).

Figure 1.4 Different sources of PTCs: replication or transcription errors, splicing
events, or programmed DNA rearrangements.

The splicing mechanism removes intronic sequences from a pre-mRNA
in order to generate an mRNA consisting of exons only (Fig. 1.5). It should
be kept in mind that not all exonic sequences are coding sequences. For
example, the 5' and the 3' untranslated regions (UTRs) are noncoding exonic
sequences. Similarly, some intronic sequences can be coding sequences
when they are retained in the mRNA. Indeed, according to the profile
of alternative splicing, an intron can become an exon by intron retention
and an exon can become an intron due to exon skipping (for the different
categories of alternative splicing, see Fig. 1.8).

The splicing reaction is carried out by a multiprotein and multi-RNA
complex called the spliceosome (Galej et al., 2014; Matera and Wang, 2014;
Zhang et al., 2013). The core spliceosome is composed of five small nuclear
ribonucleoproteins (snRNPs) called U1, U2, U4, U5, and U6, and more than 150
proteins are involved in the recognition of splice sites (Wahl et al., 2009).
All these components are sequentially loaded on the pre-mRNA in order to
play a role at a specific time in the splicing reaction (Fig. 1.6) (Chiou and
Lynch, 2014).

Figure 1.5 Constitutive splicing reaction. Two splicing reactions occurring on two
consecutive introns. The splicing reaction starts with the nucleophilic attack of the
2' hydroxyl of the branch point on the exon/intron boundary, a first transesterification
reaction that generates a free 5' exon and a lariat intron linked to the 3' exon. The
second transesterification reaction consists of the nucleophilic attack of the 3' hydroxyl
of the 5' exon on the intron/exon boundary, which generates a free lariat intron and
two linked exons. The splicing reaction occurs successively and/or simultaneously on all
exons and introns of the pre-mRNA. Exons are represented by boxes and introns by gray
lines. Splice sites are symbolized by circles containing "GU" for the 5' splice site (5' ss),
"AG" for the 3' splice site (3' ss), or "A" for the branch point (BP).



Figure 1.6 Sequential loading of spliceosome components. (1) U1 snRNP (U1) binds to
the 5' splice site in the E complex. (2) U2 snRNP binds the 3' splice site in the A complex.
(3) The tri-snRNP U4/U6 and U5 are loaded in the B1 complex. (4) U4 snRNP is released
and U2 snRNP interacts with U6 snRNP to form the B2 complex. It is in this complex that
the two transesterification reactions occur (5 and 6) to generate the mRNA harboring EJCs
(7). The U snRNPs are then recycled. Long black boxes represent exons and thick black
lines represent introns.

The challenge for the spliceosome is to distinguish exonic from intronic
sequences within a polynucleotide sequence. To that end, introns are
bordered by splice sites derived from consensus sequences, which are
degenerate in higher eukaryotes. The 5' splice site follows more or less a
“GURAGU” consensus sequence, where R stands for a purine base (adenosine
or guanosine). The 3' splice site consists of a polypyrimidine tract (cytosine
or uracil), often ended by a CAG sequence before the start of the exon.
A third splice element, called the branch point, is often located around 20 to
50 nucleotides upstream of the 3' splice site. This last element is highly
degenerate, and the only base shared by almost all branch points is an
adenosine that initiates the first transesterification reaction (Fig. 1.5).
The strength of a splice site dictates its probability of being recognized by
the spliceosome, and is related to its closeness to the consensus splice site
sequence. A strong splice site would be constitutively recognized by the
spliceosome. In contrast, the further the splice site sequence departs from
the consensus sequence, the weaker the splice site, and the lower its chances
of being recognized by the spliceosome. The notion of splice site strength is
crucial to keep in mind when a therapeutic approach aims to modulate
splicing, as in the exon skipping strategy (see Chapter C3).
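
The idea that strength reflects closeness to the consensus can be made
concrete with a deliberately crude score: the fraction of positions of a
candidate 5' splice site that are compatible with “GURAGU”. This is only a
sketch under stated assumptions. The function name `consensus_match` and the
two example sites are hypothetical, and the score ignores the regulatory
elements and binding proteins that also contribute to the strength of a site.

```python
# IUPAC-style symbols used in the consensus: R = purine (A or G).
COMPATIBLE = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "AG"}
CONSENSUS_5SS = "GURAGU"

def consensus_match(site, consensus=CONSENSUS_5SS):
    """Fraction of positions in `site` compatible with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    matches = sum(base in COMPATIBLE[symbol]
                  for base, symbol in zip(site, consensus))
    return matches / len(consensus)

strong = consensus_match("GUAAGU")  # matches the consensus at every position
weak = consensus_match("GUCCGA")    # departs from it at three positions

print(strong, weak)
```

A higher score indicates a site closer to the consensus, and therefore, in
this simplified view, a site more likely to be recognized by the spliceosome.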

Other elements influence the strength of a splice site, and are called cis-
and trans-acting regulators. Cis-acting regulators can be divided into
enhancers or silencers of splicing and, since they can be located in exons or
in introns, they are named exonic splicing enhancers (ESE), intronic splicing
enhancers (ISE), exonic splicing silencers (ESS), or intronic splicing
silencers (ISS) (Fig. 1.7). These cis elements are bound by trans-acting
factors that either activate or inhibit splicing. SR proteins and
heterogeneous nuclear ribonucleoproteins (hnRNP) are two families of proteins
that often have antagonistic effects on splicing (Caceres et al., 1994; Han
et al., 2005; Zahler et al., 2004). The family of SR proteins consists of 12
members, named SRSF1 to SRSF12, previously called ASF/SF2, SC35, SRp20,
SRp75, SRp40, SRp55, 9G8, SRp46, SRp30c, SRp38, SRp54, and SRrp35,
respectively (Fu, 1995; Manley and Krainer, 2010). SR proteins share
structural and functional features: an RNA binding domain in the N-terminal
part of the protein, an arginine/serine (RS)-rich domain in the C-terminal
part, and a role as splicing factors with a positive or negative effect on
splicing, depending on their binding sites (Bourgeois et al., 2004; Zhou and
Fu, 2013). The second family often involved in the regulation of splice site
use is the hnRNP family. It is composed of at least 20 members, named hnRNP A
to hnRNP U (Dreyfuss et al., 1993; Pinol-Roma et al., 1988). These proteins
contain one or more RNA binding domains, a protein/protein interaction
domain, a cellular localization domain, and a functional domain. They were
originally identified as proteins interacting with the pre-mRNA, also called
heterogeneous nuclear RNA (hnRNA), which explains their name. They are
involved in many different functions, from transcription to translation. They
are often categorized as splicing inhibitors, even though a positive effect
on splicing has been reported for some members of the hnRNP family, such as
hnRNP G-T (Hui et al., 2003; Hung et al., 2008; Liu et al., 2009).

Figure 1.7 Schematic representation of the cis- and trans-acting splicing regulators.
Introns are represented by a thick black line and the exon is symbolized by a rectangle.
SR proteins are noted by "SR." U1 snRNP (U1), splicing factor 1, and U2 auxiliary factor
(U2AF) of 65 and 35 kDa are shown at the 5' splice site (ss) and 3' ss, respectively.
Intronic splicing silencer (ISS), exonic splicing enhancer (ESE), exonic splicing silencer
(ESS), and intronic splicing enhancer (ISE) are mentioned. In this model, SR proteins play
the role of activators of splicing (+), whereas hnRNPs are inhibitors of splicing (-).

Interestingly, the strength of a splice site is actually dictated by the
combination of the sequence of the splice site itself, the presence of cis
activator or inhibitor elements, and the proteins that recognize these cis
elements. Indeed, a consensus splice site sequence can become a weak splice
site if cis- or trans-inhibitor elements located in the vicinity regulate it.
In contrast, a splice site sequence that differs from the consensus sequence
can be a strong splice site if positive cis and trans elements regulate it.
In addition, the strength of a splice site can be modulated according to the
cell type, the tissue, or various stimuli, meaning that, depending on the
physiological condition, a splice site can be recognized or ignored by the
spliceosome (Coelho and Smith, 2014; Fu and Ares, 2014; Lee and Rio, 2015).

2.2 Categories of Alternative Splicing 

Alternative splicing affects more than 90% of multiexonic pre-mRNAs in
humans (Blencowe, 2006; Johnson et al., 2003; Wang et al., 2008),
demonstrating that this process is almost universal among human genes. Thanks
to this mechanism, one gene can encode several proteins with various
functions, tissue specificity, and/or different regulation, explaining how the
limited number of human genes (about 25,000) can take charge of all the
necessary cellular functions, estimated at 90,000 (Roy et al., 2013; Venter
et al., 2001). Interestingly, the kinetics of removal of introns subject to
alternative splicing are slower than those of constitutive introns,
suggesting an additional level of regulation of gene expression (Khodor
et al., 2012; Pandya-Jones et al., 2013; Pandya-Jones and Black, 2009; Vargas
et al., 2011). This parameter is worth keeping in mind when designing a
therapeutic strategy that acts on splicing.

Different categories of alternative splicing have been described (Fig. 1.8)
(Roy et al., 2013; Wagner and Berglund, 2014). The most frequent alternative
splicing event in humans is exon skipping, in which the 3' and 5' splice
sites surrounding an exon are not recognized, and one of the less frequent
categories is intron retention, in which the 5' and 3' splice sites of an
intron are neither recognized nor used by the spliceosome (Sammeth et al.,
2008; Vitting-Seerup et al., 2014). Other alternative splicing categories
include the competition between several 5' or 3' splice sites for the
splicing of the same pre-mRNA region. The last category of alternative
splicing is a competition between two exons to be included in an mRNA. This
alternative splicing event, called mutually exclusive exons, induces the
exclusion of one exon from the mRNA when the competitor exon is included.
This last category is the rarest alternative splicing category found in
humans (Vitting-Seerup et al., 2014). The consequence of alternative splicing
is the deletion or the insertion of a nucleic acid sequence that might modify
the protein sequence encoded by the gene. Alternative splicing is used in
particular to introduce or remove a protein domain or a regulatory element,
allowing a change in the corresponding protein or a modification of its
expression.

Figure 1.8 The five main alternative splicing categories. Exons are represented by a box
and introns by a black horizontal line. For each category, the different splicing reactions
are symbolized by a red line. The red star indicates the result of alternative splicing for
exon skipping and intron retention. For the alternative 5' ss or 3' ss, the use of the
upstream 5' ss or the downstream 3' ss generates a shorter exon 1 or exon 2, indicated as
E1s or E2s, respectively.
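
Two of these categories, exon skipping and intron retention, can be sketched
in a few lines of Python to show how the inserted or deleted sequence affects
the reading frame. The exon and intron sequences below are hypothetical and
were chosen only for the illustration, as were the function names
`mature_mrna` and `frame_shift`.

```python
def mature_mrna(exons, introns, skipped=frozenset(), retained=frozenset()):
    """Assemble an mRNA from exons, optionally skipping exons or keeping introns.

    exons    -- list of n exon sequences, in order
    introns  -- list of n-1 intron sequences; introns[i] follows exons[i]
    skipped  -- indices of exons left out of the mRNA
    retained -- indices of introns kept in the mRNA
    """
    parts = []
    for i, exon in enumerate(exons):
        if i not in skipped:
            parts.append(exon)
        if i < len(introns) and i in retained:
            parts.append(introns[i])
    return "".join(parts)

def frame_shift(reference, isoform):
    """Length difference modulo 3 (0 means the downstream frame is preserved)."""
    return (len(isoform) - len(reference)) % 3

# Hypothetical gene: three exons, two introns starting with GU and ending with AG.
exons = ["AUGGCU", "GAAUC", "CGGAUAA"]
introns = ["GUGAGUUAACAG", "GUGAGUCCUUAG"]

constitutive = mature_mrna(exons, introns)             # AUG GCU GAA UCC GGA UAA
exon_skipped = mature_mrna(exons, introns, skipped={1})
intron_kept = mature_mrna(exons, introns, retained={0})

print(frame_shift(constitutive, exon_skipped))  # skipped exon is 5 nt long
print(frame_shift(constitutive, intron_kept))   # retained intron is 12 nt long
```

In this toy example, skipping the 5-nucleotide exon shifts the downstream
reading frame, whereas retaining the 12-nucleotide intron preserves the frame
but brings an in-frame UAA from the intron into the ORF.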

2.3 Regulation of Splicing 

Splicing is a crucial step in the maturation of mRNA, since a shift of one
nucleotide in the recognition of a splice site might lead to a frameshift
mutation, most often with the appearance of a PTC. We saw that alternative
splicing can generate several proteins with various functions from one
pre-mRNA. Splicing, and alternative splicing even more so, are tightly
controlled processes that deliver the right mRNA at the right time. For this
reason, cis- and trans-acting elements play a crucial role, involving
secondary RNA structure, additional trans factors, or a specific combination
of trans factors (House and Lynch, 2008). All these regulatory elements
switch between “on” and “off” activity according to internal and/or external
stimuli, in order to adapt the splicing profile of the cell and generate the
appropriate gene expression pattern (Kalsotra and Cooper, 2011).

To illustrate the complexity of the regulation of alternative splicing
reactions, we introduce one example that has been studied in depth for many
years: the alternative splicing of exon N1 of the cellular SRC kinase (c-src)
pre-mRNA.

The c-src gene encodes a tyrosine kinase expressed in neuronal and
nonneuronal cells. Between exon 3 and exon 4, an 18-nucleotide sequence is
recognized as an exon, called exon N1, only in neuronal cells. Exon N1 codes
for an SH3-type domain that affects the protein-protein interaction capacity
of the c-src factor. This tissue specificity involves cis-acting elements,
upstream and downstream of exon N1 (Fig. 1.9), and a combination of proteins
involved in the splicing of exon N1, whose composition depends on the cell
type. In all cell types, KH-type splicing regulatory protein (KSRP), Fuse
binding protein (FBP), hnRNP H, and hnRNP F bind the downstream regulatory
sequence called DCS, for downstream control sequence (Black, 1992; Chan and
Black, 1995; Chou et al., 1999; Min et al., 1995, 1997). The polypyrimidine
tract binding (PTB) protein also interacts with the DCS, and with the
upstream cis-regulatory element, in order to repress the recognition of exon
N1 by the spliceosome. Interestingly, PTB is not expressed in neurons;
another protein, specifically expressed in neurons, replaces PTB by
interacting with the upstream cis-regulatory element and the DCS. This
protein, called nPTB (neural PTB) (Ashiya and Grabowski, 1997; Markovtsov
et al., 2000), binds to the cis-regulatory elements and, unlike PTB, helps
the recognition of exon N1 by the spliceosome, in particular the recognition
of the 5' splice site by the U1 snRNP. The consequence is the inclusion of
exon N1 in neuronal cells, and not in other cell types.

Figure 1.9 Alternative splicing regulation of the exon N1 in nonneuronal (upper panel)
or in neuronal cells (lower panel). Exons are represented by dark blue cylinders and
introns are symbolized by a thick dark blue line. Only exons 3 (E3), N1 (N1), and 4 (E4)
of the c-src pre-mRNA are represented. U1 snRNP is symbolized by a bicolor form
harboring "U1." Other proteins are mentioned with their name on each model. The DCS
is symbolized by a light blue rectangle. The binding sequences for PTB or nPTB are
represented by an orange box.

In addition to the intronic splicing elements, some exonic splicing
regulators have been identified in exon N1 (Rooke et al., 2003). This
study showed that exon N1 is bound by the SR protein SRSF1 and by
hnRNP A1, hnRNP F, and hnRNP H. Interestingly, SRSF1 stimulates the
recognition of exon N1 by the spliceosome, whereas hnRNP A1 inhibits
the splicing of N1.

The c-src model shows the complexity of alternative splicing regulation,
which involves intronic and exonic splicing regulators that can be
activators or inhibitors of splicing, depending on the proteins that bind
them. This model also exemplifies the antagonistic effect of SR proteins and
hnRNP factors (in particular hnRNP A1) and the involvement of a
tissue-specific splicing factor (nPTB).

2.4 Pathologies Associated with Splicing Defects

Diseases related to a splicing defect can arise in two different ways. The
first involves mutations directly affecting a splicing factor, which impair
specific or general splicing reactions. Often, such mutations are lethal,
illustrating the dependence of cells on splicing. Indeed, the splicing
factors U2AF35 and SF3b subunit 4 have been shown to be essential during the
early developmental stages of the zebrafish, in a screen using insertional
mutagenesis to inactivate these genes (Golling et al., 2002). Although this
situation is rare, some pathologies have been shown to be related to the
level of expression of splicing factors. For example, the SR protein SRSF3
(SRp20) has been shown to be overexpressed in different colorectal cancer
cell lines (SW480, HT29, or DLD1) (Goncalves et al., 2008), and in many
cancers (He et al., 2004, 2011; Jia et al., 2010). In contrast,
downregulation of SRSF3 slows down cell growth (He et al., 2011). The
consequence of modulating SRSF3 expression is a modification of the cellular
splicing pattern. Among the targets of SRSF3 is the pre-mRNA encoding the
transmembrane receptor CD44 (Galiana-Arnoux et al., 2003). This transcript is
subject to alternative splicing on several variable exons, named v2 to v10,
located between exon 5 and exon 16, as well as on exon 18 and exon 19, which
are mutually exclusive. Alternative splicing of this transcript generates RNA
isoforms with different properties and, depending on the variable exon
included, CD44 gains some metastatic properties, as is the case for the
isoforms with v8 to v10 (Yae et al., 2012). Interestingly, ESEs have been
found in these exons; they respond to the SR proteins SRSF1, SRSF3, and
SRSF7, leading to the retention of these exons in CD44 mRNA when the level of
these SR proteins increases (Galiana-Arnoux et al., 2003; Goncalves et al.,
2008). Clearly, the level of some SR proteins influences, at least partially,
the process of tumorigenesis by promoting the synthesis of particular CD44
mRNA isoforms.

SRSF3 is not the only splicing factor with oncogenic properties, since
such properties have also been demonstrated for SRSF1, SRSF6, SRSF9, hnRNP
A2/B1, and hnRNP H (Cohen-Eliav et al., 2013; Fu et al., 2013; Golan-Gerstl
et al., 2011; Karni et al., 2007; Lefave et al., 2011). In contrast, some
splicing factors have been shown to behave as tumor suppressors, such as
RBM5, RBM6, and RBM10 (Bechara et al., 2013). A balance between tumorigenic
and apoptotic splicing factors is essential to maintain the global splicing
profile of the healthy cell.

Another pathology has been shown to be related to mutations in
the splicing machinery. Indeed, some patients with retinitis pigmentosa
carry mutations in the splicing factors PRPF31/U4-61k or PRP8 (Boon
et al., 2007; Vithana et al., 2001; Wilkie et al., 2008). Several mutations
in the splicing factor PRPF31, a component of the U4 snRNP, have been
reported to cause retinitis pigmentosa. In particular, the missense mutation
A216P stabilizes the interaction between PRPF31 and PRPF6, a component of the
U5 snRNP. This stabilization inhibits splicing, likely by preventing the
disassembly of the U4 and U5 snRNPs that is required for them to be recycled
for a new splicing reaction. In the case of the PRP8 mutations found in
retinitis pigmentosa, another mechanism is involved. PRP8 is a component of
the U5 snRNP and interacts with Brr2, a component that joins the U5 snRNP
during the maturation of the latter. Mutations found in patients with
retinitis pigmentosa are located in the highly conserved C-terminal part of
PRP8. The consequence of those mutations is a loss of interaction between
PRP8 and Brr2, leading to an accumulation of the immature form of the U5
snRNP, and then to an inhibition of splicing (Boon et al., 2007; Pena et al.,
2007).

Few pathologies have been described that involve mutations in the splicing
machinery, since such mutations are often thought to be lethal. In most
pathologies related to a splicing failure, mutations affect one splicing
reaction on a specific pre-mRNA, rather than the function of a general
splicing factor. Examples of mutations affecting either a splice site or a
regulatory element are numerous in the literature. One example illustrating
how a specific splicing defect can be at the origin of a pathology concerns
the BRCA1 gene. Mutations in the tumor suppressor gene BRCA1 are often found
in hereditary breast and ovarian cancers (Miki et al., 1994). Among the
mutations frequently found, the point mutation G→T in exon 18 changes a
glutamic acid codon into a stop codon, and leads to the inactivation of an
ESE bound by SRSF1 (Liu et al., 2001) (Fig. 1.10). Such a mutation has two
consequences. The first is the skipping of exon 18, due to the inactivation
of the ESE, generating an internally truncated BRCA1 missing 26 amino acids
(D1692-F1717) (Mazoyer et al., 1998). These 26 amino acids belong to a highly
conserved region of the C-terminal domain of BRCA1 (BRCT), so their loss
likely impairs the function of the BRCA1 protein, since this domain is
involved in protein interactions (Yu et al., 2003). The second consequence is
the decay by NMD of the PTC-containing BRCA1 mRNA isoform. In either case,
the wild-type BRCA1 protein is not produced, leading to a higher
susceptibility to the development of cancers.

Figure 1.10 Influence of a nonsense mutation in the exon 18 of the BRCA1 gene. For the
wild-type gene (A), SRSF1 interacts with an ESE to promote the splicing of introns 17
and 18 and to generate a wild-type BRCA1 mRNA, including the exon 18. When a G→T
transversion mutation occurs (B), it generates a nonsense mutation and abolishes the
recognition of the ESE by SRSF1, leading to the skipping of the exon 18 and a
nonfunctional internally truncated BRCA1 protein.

Having described the mechanism of splicing, we can now look in detail at the
recognition of PTCs and at nonsense-mediated mRNA decay in particular. Although
this mechanism was discovered more than 35 years ago in yeast (Losson
and Lacroute, 1979), and soon after in human (Maquat et al., 1981), and has been
thoroughly studied since the early 1990s, the model for the recognition of
PTCs and the decay of the mRNA is still incomplete, and is constantly updated.


3 NONSENSE-MEDIATED mRNA DECAY (NMD) MECHANISM 
3.1 Generalities 

NMD is an mRNA surveillance mechanism that specifically recognizes and
degrades mRNAs harboring a PTC, in order to prevent the synthesis of a
truncated protein that would be nonfunctional, or could carry a deleterious
activity for the cell. However, NMD also degrades mRNAs harboring
a PTC that would lead to the synthesis of a partially or totally functional
truncated protein (Bhuvanagiri et al., 2010; Chang et al., 2007; Conti and
Izaurralde, 2005; Kervestin and Jacobson, 2012; Kuchino and Muramatsu,
1996; Lejeune and Maquat, 2005; Popp and Maquat, 2014; Rebbapragada
and Lykke-Andersen, 2009; Silva and Romao, 2009). The NMD mechanism
occurs in the cytoplasm, consistent with the involvement of the translation
machinery in PTC recognition (Singh et al., 2007) during the pioneer round of
translation or first round of translation (Ishigaki et al., 2001) and the following
rounds of translation (Durand and Lykke-Andersen, 2013; Rufener and
Mühlemann, 2013). Since the mechanism of NMD can differ according
to the species, we will describe here how NMD occurs in human cells.

The first round of translation occurs on a specific mRNP carrying
CBP80 and CBP20 on the cap structure at the 5' end of the mRNA, the
poly(A) binding proteins C1 (PABPC1) and N1 (PABPN1) on the poly(A) tail at the
3' end of the mRNA, and EJCs distributed upstream of exon-exon junctions
(Chiu et al., 2004; Hosoda et al., 2006; Ishigaki et al., 2001; Lejeune
et al., 2002, 2004; Sato et al., 2008). In the other rounds of translation that
support the bulk of translation, CBP80/20 is replaced by eIF4E, PABPC1
is the only PABP present on the poly(A) tail, and no EJCs are present any
longer on the mRNA (Fig. 1.11).

Figure 1.11 Consequences of the first (pioneer) round of translation occurring on wild-type
mRNA and on PTC-containing mRNA. The first round of translation occurs on mRNA
carrying EJCs, CBP80/CBP20 on the 5' end, and a mix of the poly(A) binding protein C1
(PABPC1; C1) and the poly(A) binding protein N1 (PABPN1; N1) on the poly(A) tail at
the 3' end. On a wild-type mRNA (left part), EJCs, CBP80/CBP20, and PABPN1 will be
removed from the mRNA after the pioneer round of translation and eIF4E will replace
CBP80/CBP20 on the cap structure, in order to support the bulk of translation. On a
PTC-containing mRNA (right part), the mRNA will still carry at least one EJC, CBP80/
CBP20, and PABPN1/PABPC1, in order to promote NMD.

In mammalian cells, two models attempt to explain how NMD can distinguish
between PTCs and normal termination codons. The historically first
model proposes that a PTC is defined as a translation termination codon located
more than 50-55 nucleotides upstream of at least one exon-exon junction.
This model suggests a strong link between pre-mRNA splicing and NMD.
Indeed, a protein complex called exon junction complex (EJC) is deposited
20-24 nucleotides upstream of all exon-exon junctions, as a consequence of
splicing (Le Hir et al., 2000a,b, 2001; Le Hir and Seraphin, 2008). The second
model proposes that NMD is activated when the distance between the
stop codon and the cytoplasmic PABPC1 is abnormally long (Behm-Ansmant
et al., 2007; Eberle et al., 2008; Fatscher et al., 2014; Silva et al., 2008; Singh
et al., 2008). This model necessitates a molecular mechanism measuring this
distance and determining whether the length is normal or not.
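
The first model lends itself to a simple positional test, sketched below under stated assumptions: positions are counted on the spliced mRNA, the threshold defaults to 50 nucleotides (the lower bound of the 50-55 range quoted above), and the function names and example coordinates are hypothetical. The second model is not sketched, since the text gives no numerical criterion for an "abnormally long" distance.

```python
from typing import Sequence


def is_premature_stop(stop_pos: int, junctions: Sequence[int],
                      min_distance: int = 50) -> bool:
    """Return True if the stop codon lies more than `min_distance` nucleotides
    upstream of at least one exon-exon junction (spliced-mRNA coordinates)."""
    return any(junction - stop_pos > min_distance for junction in junctions)


def ejc_window(junction: int) -> range:
    """Positions 20-24 nucleotides upstream of a junction, where an EJC is deposited."""
    return range(junction - 24, junction - 19)


# Hypothetical transcript with exon-exon junctions at positions 300 and 600.
junctions = [300, 600]
normal_stop = is_premature_stop(620, junctions)     # stop codon in the last exon
premature_stop = is_premature_stop(200, junctions)  # stop codon 100 nt before a junction
```

With these made-up coordinates, the stop codon at position 620 has no junction downstream of it and is treated as normal, whereas the one at position 200 lies well upstream of a junction and would be flagged as a PTC.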

NMD requires about a dozen central proteins in order to distinguish a PTC
from a normal termination codon. These proteins are called Up frameshift
(UPF) proteins, after their identification in the yeast Saccharomyces cerevisiae
(Culbertson et al., 1980; Leeds et al., 1991, 1992), or suppressors with
morphogenetic defects of genitalia (SMG), so named after their identification
in Caenorhabditis elegans (Cali et al., 1999; Hodgkin et al., 1989) (see
Table 1.3 for the correspondence). The exact role of these proteins in the
mechanism of NMD still remains unclear, even though all these proteins
have been intensively studied and characterized in order to generate structural,
cellular, and functional data that we are going to review.

3.2 Main proteins involved in NMD 
3.2.1 UPF1/RENT1/SMG2 

The UPF1 gene is carried by chromosome 19 and encodes a 123 kDa protein
with several functional domains, such as a CH domain, a helicase domain,
and an S/Q domain (Fig. 1.12). An ultraconserved domain is present at the
N-terminal part of the protein, and this domain has recently been shown to be
cleaved by caspases during apoptosis and to promote apoptosis (Jia et al., 2015;
Popp and Maquat, 2015). The role of the ATP-dependent helicase activity in
NMD has been studied for a while, starting with the discovery that a
mutant version of UPF1 with an impaired helicase domain (after the mutation of
the arginine at position 843 into a cysteine: R843C) behaves as a dominant
negative protein inhibiting NMD (Sun et al., 1998). It is only recently that the
function of this domain in NMD has been clarified (see next paragraph).

Figure 1.12 Schematic representation of UPF1 and its functional domains. Amino
acid positions are indicated at the top. The N-terminal conserved region (NCR); the
cysteine-histidine-rich (CH) domain; the helicase domain with the RecA1, RecA2, Rec1B,
and Rec1C motifs; and the serine-glutamine-rich (SQ) domain, including the recognition
and phosphorylation S/T-Q motifs are indicated.

UPF1 interacts with a wide network of proteins (Varsally and Brogna, 2012),
so we will concentrate only on proteins with a demonstrated role in NMD.
Prior to being recruited to the EJC, UPF1 first interacts with the cap binding
protein CBP80 via its helicase domain (Hwang et al., 2010). This interaction
promotes the recruitment of another partner of UPF1, the protein
kinase SMG1, associated with its partners, the proteins SMG8 and SMG9,
via the SQ domain. In addition, prior to being recruited to the mRNA,
UPF1 also interacts with the release factors 1 and 3, in a complex called
SURF (for SMG1-UPF1-eRF1 and eRF3 complex) that works as a translation
termination complex (Ivanov et al., 2008; Kashima et al., 2006). UPF2
and UPF3X then recruit UPF1 (in the SURF complex) to the EJC, via
its CH domain at least, and stimulate its helicase activity (Chakrabarti
et al., 2011; Chamieh et al., 2008) in order to form the DECID (DECay
InDucing) complex (Kashima et al., 2006). The interaction of UPF1 with
UPF2 modifies the spatial conformation of UPF1, which becomes active for
its 5'-3' unwinding activity, thanks to its helicase domain. Interestingly, the
helicase activity of UPF1 is tightly controlled, since the N-terminal domain
of UPF1 (CH domain), as well as the C-terminal domain (SQ domain),
negatively influences the helicase activity. It is only when UPF1 engages
in interactions with multiple partners through its CH and SQ domains that the
helicase activity is released (Fiorini et al., 2013). Once the ATPase/helicase activity
is unlocked, UPF1 will displace proteins bound to the part of the mRNA located
downstream of the PTC, in order to allow its fast degradation (Franks et al., 2010).

Recently, UPF1 has been demonstrated to bind mRNAs independently
of the EJC, all over the ORF and on the 3' UTR. Interestingly, when ribosomes
read the ORF, they displace UPF1 present on the ORF, resulting in
a concentration of UPF1 in the 3' UTR (Hogg and Goff, 2010; Kurosaki
and Maquat, 2013; Zünd et al., 2013). These data are in opposition to previous
works showing that NMD is activated on an mRNA by tethering UPF
proteins, such as UPF1, downstream of a physiological stop codon (Gehring
et al., 2003; Gonzalez-Hilarion et al., 2012; Lykke-Andersen et al., 2000).
However, this apparent opposition might be linked to the stability of the
interaction between the UPF protein and the 3' UTR: in the case of the
tethering assay, the interaction might be more stable than in the case of the
native UPF1 protein bound to the 3' UTR.

UPF1 distributes homogeneously in the cytoplasm, but shuttles between
cytoplasm and nucleus (Lykke-Andersen et al., 2000; Mendell et al., 2002;
Serin et al., 2001). The cytoplasmic fraction plays a role at least in NMD,
as described earlier. The nuclear fraction of UPF1 fulfills various functions in
addition to NMD (Imamachi et al., 2012; Varsally and Brogna, 2012). Indeed,
UPF1 is involved in the maintenance of the telomeres, and in particular
in the synthesis of the leading strand of telomeres (Azzalin et al., 2007;
Chawla et al., 2011). Another function of UPF1 in the nucleus is during
DNA replication, since the silencing of the UPF1 gene leads to a cell
cycle arrest at the S-phase (Azzalin and Lingner, 2006b). Consistent with
this function, UPF1 interacts with the DNA polymerase δ in an RNA-independent
manner (Azzalin and Lingner, 2006b; Carastro et al., 2002).
This interaction raises the possibility that the helicase domain of UPF1
plays a role in the replication fork progression.

Based on the fact that UPF1 is predominantly cytoplasmic and only
a small proportion of UPF1 resides in the nucleus, it is possible that the
nuclear UPF1 has only a regulatory role in DNA replication, telomere
maintenance, or other yet unidentified nuclear functions. In addition, it
is also possible that the nuclear UPF1 already plays a role in NMD by
binding the 3' UTR before other proteins cover that region of the mRNA,
since UPF1 concentrates on the 3' UTR (Hogg and Goff, 2010; Kurosaki and
Maquat, 2013; Zünd et al., 2013). This binding of UPF1, and in particular
of the phosphoisoform of UPF1, in the 3' UTR of mRNAs could serve different
functions. For instance, phospho-UPF1 has been shown to interact
with the stem loop binding protein (SLBP) that binds to the stem loop
present at the 3' end of histone mRNAs. SLBP is responsible for the regulation
of the translation and the stability of histone mRNAs. This responsibility
could be achieved via its interaction with UPF1 that would next recruit
SMG5 and proline-rich nuclear receptor coactivator 2 (PNRC2) protein in
order to induce the mRNA decay (Fig. 1.13) (Choe et al., 2014).

3.2.2 UPF2/RENT2/SMG3

UPF2 is a predominantly cytoplasmic protein that concentrates around the
nucleus (Lykke-Andersen et al., 2000; Serin et al., 2001). A nuclear localization
sequence (NLS) is present in the N-terminal region of UPF2, suggesting
that this protein can also enter the nucleus (Serin et al., 2001). The human UPF2
gene is located on chromosome 10 and encodes a protein of about
150 kDa. Several functional domains have been found in UPF2, in particular
three middle domains of eIF4G (MIF4G) in the N-terminal and in the
central region of UPF2, and a UPF1 interacting domain in the C-terminal part
(Fig. 1.14). UPF2 interacts with UPF1 and with UPF3/UPF3a or UPF3X/
UPF3b in particular, to activate both the ATPase and the helicase function of
UPF1 (Chamieh et al., 2008). Interestingly, UPF2, alone or associated with
UPF3X/UPF3b, binds RNA at least in vitro, but the functional relevance of
that activity remains to be clarified (Kadlec et al., 2004). UPF2, like UPF1, is a
phosphoprotein (Chiu et al., 2003); the involvement of its phosphorylation in
crucial interactions with other NMD factors has been shown only in the yeast
S. cerevisiae, but not yet in mammals (Wang et al., 2006).

Figure 1.13 Histone mRNA decay pathway. During histone mRNA translation, the SLBP
protein interacts with a stem loop located in the 3' UTR of the histone mRNA. SLBP interacts
with CBP80 to circularize the mRNA and promote an efficient translation. In addition,
SLBP interacts with SLBP interacting protein 1 (SLIP1) and the CBP80/CBP20-dependent
translation initiation factor (CTIF). Under conditions requiring the silencing of histone
genes, such as the inhibition of DNA replication, UPF1 is recruited to the ribosome
pausing on the physiological stop codon and becomes hyperphosphorylated by the ATR
and/or DNA-PK proteins. Hyperphosphorylated UPF1 interacts with SLBP, inducing the
release of CTIF. Then SMG5 and PNRC2 are recruited by UPF1, which induces the decapping
of the 5' end of the histone mRNA by helping in the recruitment of the decapping
complex Dcp1/Dcp2 and the recruitment of 5'-3' exoribonuclease activity. Due to
the unprotected 3' end of the mRNA, the exosome is loaded and degrades the mRNA from
the 3' end to the 5' end. An alternative and/or concomitant way is the recruitment of a
TUTase by hyperphosphorylated UPF1, in order to induce the addition of a polyuridine
tail at the 3' end of the mRNA; this will recruit the Lsm1-7 complex. That complex then
helps in the recruitment of the decapping enzymes Dcp1/Dcp2 at the 5' end, and the 3'
to 5' decay enzymes at the 3' end of the histone mRNA.

Figure 1.14 Schematic representation of UPF2 functional domains. The middle domains
of eIF4G [MIF4G (1-3)] and the UPF1 binding domain (U1BD) are represented. The amino acid
positions are indicated at the top. The interacting surfaces with UPF3 and UPF1 are shown
by a thick black line at the top. Finally, the linker domains 1-3 are shown (LR1-LR3).

The role of UPF2 in NMD remains incompletely characterized, given
the limited functions of this protein in the NMD mechanism identified
up to the present time: interaction with UPF1 and UPF3/UPF3a or
UPF3X/UPF3b, and activation of the helicase activity of UPF1. In addition,
UPF2 seems not to be absolutely required for some NMD reactions, according
to the composition of the EJC (Gehring et al., 2005), suggesting
that several different pathways can activate NMD. However, the involvement
of UPF2 in NMD might be more predominant than originally thought,
since, like UPF1 and unlike UPF3X/UPF3b, it is one of the UPF proteins
cleaved by caspases during apoptosis (Jia et al., 2015). The fact that UPF2
is targeted by caspases in order to shut down NMD during apoptosis suggests
a central role of that protein in NMD and/or an involvement in other
pathways that have to be blocked to allow cell death progression. The UPF2
caspase-cleavage fragments promote new activities, providing new information
on the functional properties of the UPF2 domains. The N-terminal
fragment, for instance, is able to promote apoptosis and to inhibit
NMD when it is overexpressed in cells (Jia et al., 2015). The NMD
inhibition by the N-terminal fragment could suggest that this part of UPF2
is involved in interactions with other NMD factors, but is unable to assume
the function of UPF2 in NMD, making this fragment a dominant negative.
In support of this statement, a highly conserved N-terminal region has
been found in yeast UPF2, in which mutations impair NMD (Fourati
et al., 2014). The C-terminal part has been shown to interact with UPF1
and SMG1. This C-terminal part alone retains the function of UPF2 in
NMD, explaining why an overexpression of this fragment does not interfere
with NMD (Clerici et al., 2013; Jia et al., 2015; Kadlec et al., 2004). UPF2
was thought to be simply the bridge between UPF1 and UPF3/UPF3a or
UPF3X/UPF3b bound to the EJC (Kashima et al., 2006). However, this
model is starting to be challenged, since UPF2 could stimulate the endonucleolytic
cleavage activity of SMG6 in an EJC-independent way, meaning that
UPF2 is not necessarily recruited to the EJC in order to function in NMD
(Boehm et al., 2014). It also means that a bridge between UPF1 and UPF3/
UPF3a or UPF3X/UPF3b is not absolutely required for NMD. Supporting
this view, an electron microscopy study of complexes including SMG1,
SMG8, SMG9, UPF1, and UPF2 shows that UPF2 can be recruited to the
complex formed by SMG1/8/9 and UPF1, without the presence of the EJC
or UPF3/UPF3a or UPF3X/UPF3b (Melero et al., 2014). Overall, these
data suggest that the role of UPF2 in NMD is not limited to promoting the
interaction between UPF1 and UPF3/UPF3a or UPF3X/UPF3b.

3.2.3 UPF3/UPF3a/Rent3A 

The UPF3 (also called UPF3a) gene is carried by chromosome 13 in human
cells and encodes a protein of about 52 kDa that mainly localizes to
the nucleus, even though it can shuttle between the cytoplasm and the
nucleus (Lykke-Andersen et al., 2000; Serin et al., 2001) (Fig. 1.15). UPF3/
UPF3a is a paralogous gene to UPF3X (also called UPF3b), sharing 60%
similarity and 42% identity (Serin et al., 2001), and with overlapping
functions. UPF3/UPF3a is the only NMD factor to have a
paralogous gene. An explanation could be related to the location of the
UPF3X/UPF3b gene on chromosome X. Indeed, many genes located
on chromosome X have a paralog on an autosomal chromosome, since
chromosome X is inactivated during spermatogenesis and some functions
carried by chromosome X are still needed during spermatogenesis
(Wang, 2004). UPF3/UPF3a and UPF3X/UPF3b have been shown to interact
with the EJC components Y14, Ref/Aly, RNPS1, and SRm160 in an
RNase-insensitive way (Kim et al., 2001; Lejeune et al., 2002), and with UPF1
and UPF2 (Chan et al., 2009; Lejeune et al., 2002, 2003). Although UPF3/
UPF3a is less efficient than UPF3X/UPF3b in promoting NMD (Gehring
et al., 2005; Lykke-Andersen et al., 2000), UPF3/UPF3a can compensate for a
decrease in the level of UPF3X/UPF3b (Chan et al., 2009). However, natural
substrates of NMD are not regulated interchangeably by UPF3/UPF3a
and UPF3X/UPF3b, since some mRNAs upregulated in the absence of
UPF3X/UPF3b are not affected by a downregulation of UPF3/UPF3a,
such as those of the GADD45B, ATF3, ASNS, or MAFF genes (Chan et al., 2007). Due
to the low involvement of UPF3/UPF3a in NMD, UPF3X/UPF3b has
been more studied than UPF3/UPF3a in the recognition of PTCs and in
the NMD mechanism.

Figure 1.15 Schematic representation of UPF3 and its functional domains. Amino acid
positions are indicated at the top. The NES, the RRM, and the NLS are shown. The interacting
surfaces with UPF2 and MAGOH/Y14 are indicated by a thick black line at the top.

3.2.4 UPF3X/UPF3b/Rent3B 

The UPF3X/UPF3b gene is located on chromosome X and encodes a
protein of about 58 kDa (Lykke-Andersen et al., 2000; Serin et al., 2001).
UPF3/UPF3a and UPF3X/UPF3b are paralogs, and share the yeast UPF3
gene as their common ancestor gene. UPF3X/UPF3b is a mainly nuclear
protein with the capacity to shuttle between the nucleus and the cytoplasm,
thanks to the presence of four nuclear localization sequences (NLS) located
in the middle and in the C-terminal part of the protein, and one nuclear
export sequence (NES) located at the N-terminal part of the protein.
UPF3X/UPF3b also has several other clearly identified domains, such as an
RRM (RNA recognition motif) and some interacting domains with UPF2
and MAGOH/Y14 (Fig. 1.16).

Figure 1.16 Schematic representation of UPF3X/UPF3b and its functional domains.
The NES, NLS, and the RRM are represented. The interaction surfaces with UPF2 and
MAGOH/Y14 are shown. The amino acid position is indicated at the top.

According to the current model, UPF3X/UPF3b is thought to be the
first NMD factor to be loaded on the mRNA, before the export of the
mRNA to the cytoplasm (Le Hir et al., 2001). The recruitment of UPF3X/
UPF3b on the mRNA is operated by the EJC via its C-terminal domain
interacting with three proteins of the core EJC (eIF4AIII, Y14, and MAGOH)
(Buchwald et al., 2010; Gehring et al., 2003). UPF3X/UPF3b also
interacts with UPF2 via its N-terminal domain. Although downregulation
of UPF3X/UPF3b impairs many NMD reactions (Chan et al., 2009), the
role of UPF3X/UPF3b in NMD is not clearly defined, except in bridging
UPF2 to the EJC and stimulating the helicase activity of UPF1 (see
the section on UPF1). In addition, its role in NMD is not absolutely required,
since a downregulation of UPF3X/UPF3b is partially compensated
by UPF3/UPF3a (Chan et al., 2009). Moreover, some NMD reactions, such
as on the T-cell receptor mRNA, occur in the absence of UPF3/UPF3a and
UPF3X/UPF3b (Chan et al., 2007), consistent with the idea that several
pathways can activate NMD.

3.2.5 Suppressor of Morphogenesis in Genitalia 1 (SMG1)/ATX/
Lambda-Iota Protein Kinase C-Interacting Protein (LIP)

SMG1 is a large protein with a molecular weight of 410 kDa and a member
of the phosphatidylinositol 3-kinase (PI3K)-related protein kinase (PIKK)
family, encoded by a gene located on chromosome 16 (Fig. 1.17). Its
role in NMD is to phosphorylate UPF1 when the ribosome is pausing at a
PTC and when UPF1 interacts with UPF2 and/or UPF3/UPF3X proteins.
SMG1 phosphorylates the threonine 28 at the N-terminal end of UPF1
and several serines at the C-terminal end of UPF1, in particular the
serine 1096, which serve as anchors for SMG6 and SMG5-SMG7, respectively
(Okada-Katsuhata et al., 2011). Since preventing the interaction between
SMG1 and UPF1 impairs NMD (Hu et al., 2013), this interaction became
an interesting target to identify inhibitors of NMD by looking for molecules
capable of interfering with this interaction (Usuki et al., 2004).

SMG1 interacts with UPF1 independently of the presence of the other
UPF proteins or the EJC. In this isolated complex, the kinase activity of
SMG1 is repressed by the proteins SMG8 and SMG9 (see Section 3.2.9)
(Yamashita et al., 2009). It is only when SMG1 and UPF1 are recruited to
the EJC and to the translation termination complex (ribosome, eRF1, and
eRF3) that SMG1 phosphorylates UPF1, once UPF2 interacts with the C-terminal
part of SMG1 (Kashima et al., 2006).

Besides its activity in NMD, SMG1, as a PI3 kinase-related kinase family
member, is also involved in other processes, such as the DNA damage
response or telomere maintenance. Indeed, SMG1 is activated when
cells are exposed to genotoxic stresses, leading to the phosphorylation of
downstream targets, such as the serine 15 of p53 (Brumbaugh et al., 2004;
Gewandter et al., 2011).

Figure 1.17 Schematic representation of SMG1 and its functional domains. As a member
of the PI3K family, SMG1 possesses a huntingtin, elongation factor 3, a subunit of PP2A,
and TOR1 (HEAT) repeat domain at the N-terminal part. The HEAT domain is followed by a
conserved ~600 amino acids FRAP/TOR, ATM, and TRRAP (FAT) domain and a ~100 amino
acids FKBP12-rapamycin-binding (FRB) domain involved in the binding with UPF2 (Melero
et al., 2014). Then, the catalytic kinase domain (PIKK) of about 340 amino acids is indicated
in yellow, followed by a poorly characterized insertion domain of more than 1000 amino
acids (insertion). Finally, at the C-terminal end of SMG1 is the FAT C-terminal (FATC) domain
required for the kinase activity of SMG1 and for protein-protein interaction (Lempiäinen
and Halazonetis, 2009). The amino acid position is indicated at the top.

In addition, SMG1 has been implicated in another mechanism requiring
UPF1 called Staufen-mediated mRNA decay (SMD), which is a decay
pathway of specific mRNAs, recognized by the Staufen protein via a
secondary structure located in the 3' UTR of these mRNAs (Park and
Maquat, 2013). Indeed, downregulation of SMG1 impairs SMD, and overexpression
of a nonfunctional version of SMG1 also inhibits SMD (Cho
et al., 2013).

Recently, two cofactors of SMG1 have been identified and called SMG8
and SMG9 (see Section 3.2.9) (Yamashita et al., 2009). Both proteins are
present in the SURF complex and interact with SMG1 in order to regulate
its activity, together with some additional proteins, called RuvB-like
AAA ATPase 1 (RUVBL1) and RuvB-like AAA ATPase 2 (RUVBL2)
(Izumi et al., 2010). It is expected that the activity of SMG1 is tightly regulated,
since the phosphorylation of UPF1 is the key event that will commit
an mRNA to a fast decay.

3.2.6 SMG5/EST1B 

The SMG5 gene is located on chromosome 1, and encodes a protein
of about 114 kDa (Fig. 1.18). SMG5 is found in the cytoplasm, as well as
in the nucleus (Durand et al., 2007; Unterholzner and Izaurralde, 2004).
In the cytoplasm, a fraction of SMG5 localizes into the processing bodies
(P-bodies) (Note 1.1). SMG5 harbors a nonfunctional PilT N-terminus
(PIN) domain at its C-terminal end, which is a domain found in single-strand
RNases (Glavan et al., 2006), and a tetratricopeptide repeat (TPR)
domain at the N-terminal end of the protein. The TPR domains and,
more particularly, the 14-3-3 domains of SMG5 and SMG7, interact to
form a heterodimer SMG5-SMG7 that will bind phosphorylated UPF1
(Chakrabarti et al., 2014). SMG5 functions as a component of the dephosphorylation
complex of UPF1 during the NMD process (Ohnishi et al., 2003).

The role of SMG5 in NMD is to interact with UPF1, together with the
protein phosphatase 2A, SMG7, and SMG6, in order to promote the dephosphorylation
of UPF1 (Ohnishi et al., 2003). Besides their involvement in NMD,
SMG5 and SMG6 have been shown to interact with the human telomerase
reverse transcriptase (hTERT) and to play a role in the maintenance of
the length of telomeres (Reichenbach et al., 2003; Snow et al., 2003). This
role involves the fraction of SMG5 present in the nucleus and dedicated to
a process independent of NMD, even though several NMD factors, such as
UPF1, UPF2, or SMG1, take part in the maintenance of the integrity of the
genome and the length of telomeres (Azzalin and Lingner, 2006a,b; Azzalin
et al., 2007; Brumbaugh et al., 2004).

Figure 1.18 Schematic representation of SMG5 and its functional domains. At the N-terminal
end of the protein is found the tetratricopeptide region formed by a 14-3-3
domain and a helical hairpins domain interrupted by an insertion of 220 amino acids.
At the C-terminal part of the protein is the PIN domain. The amino acid position is indicated
at the top.

NOTE 1.1 The Processing Bodies (P-bodies)

P-bodies were first described by following the cellular localization
of the 5'-3' exoribonuclease Xrn1 (Bashkirov et al., 1997). They were then
characterized as cytoplasmic foci, containing proteins involved in RNA decay,
such as Lsm proteins, the decapping enzymes, Staufen, Argonaute, and proteins
involved in NMD (Anderson and Kedersha, 2006; Cougot et al., 2004; Durand
et al., 2007; Ingelfinger et al., 2002; Liu et al., 2005; Sheth and Parker, 2003;
Unterholzner and Izaurralde, 2004; van Dijk et al., 2002). Some RNAs have been shown
to be present in P-bodies, such as miRNAs (Pillai et al., 2005) or PTC-containing
mRNAs (Durand et al., 2007). Although ribosomal proteins are not found in P-bodies,
indicating that translation does not occur in P-bodies, some proteins involved
in translation have been found in these cytoplasmic foci, such as eIF4E (Anderson
and Kedersha, 2006; Kedersha et al., 2005). P-bodies are dynamic structures, but
cannot be considered as organelles, since they are not limited by a membrane,
and their function is still not clear, at least in mammals, unlike in yeast, where it
has been demonstrated that P-bodies are the place where RNAs are degraded
(Sheth and Parker, 2003). In mammals, P-bodies could be either a storage place
for RNAs and decay enzymes or the site where RNA decay takes place.

Figure 1.19 Schematic representation of SMG6 and its functional domains. At the N-terminal
end of the protein, two EJC binding motifs (EBMs) have been identified. The
TPR domain is located in a central to C-terminal region of the protein and is formed by a
14-3-3 motif and a helical hairpins domain. The functional PIN domain is situated at the
C-terminal end. The amino acid position is indicated at the top.


3.2.7 SMG6/EST1A/hSmg5/7a 

The SMG6 gene is located on chromosome 17 and encodes a protein
of about 160 kDa. SMG6 localizes almost exclusively in the cytoplasm,
where it concentrates into P-bodies (Durand et al., 2007; Unterholzner and
Izaurralde, 2004). Several functional domains have been found in the SMG6
protein (Fig. 1.19). Just as for SMG5, a PIN domain is present at the C-terminal
part of the protein but, unlike in SMG5, this domain is functional
in SMG6, since SMG6 has an RNase activity (Huntzinger et al., 2008).
Indeed, SMG6 has been shown to have an endonuclease activity that is
thought to cleave RNA in the vicinity of the PTC (Eberle et al., 2009;
Huntzinger et al., 2008; Mascarenhas et al., 2013). In the middle of the C-terminal
region of the protein, the TPR domain, and in particular a 14-3-3-like
domain, mediates the interaction with phosphorylated UPF1 (Fukuhara
et al., 2005). Finally, the N-terminal end of the protein interacts with
the EJC via two EJC binding motifs (EBMs) (Kashima et al., 2010). SMG6
also interacts with UPF1 in a phosphorylation-independent way, via the stalk
and the SQ domains of UPF1 (Nicholson et al., 2014).

In the NMD process, SMG6 binds to UPF1 once UPF1 has been
loaded on the EJC and phosphorylated by SMG1. It is at that step that SMG6,
via its endonucleolytic activity, cuts the mRNA in the vicinity of the
PTC to generate two fragments with an unprotected 5' or 3' end (Eberle
et al., 2009; Huntzinger et al., 2008; Mascarenhas et al., 2013). These
fragments are then quickly degraded by the exonucleolytic pathways, involving
the exosome for the decay from the 3' end to the 5' end, and
the exoribonucleases XRN1 or XRN2 for the decay from the 5' to the
3' end (Lejeune et al., 2003). Interestingly, SMG6 shows a preference in
its cleavage sequence, as shown by analysis of hundreds of endogenous
substrates for SMG6 cleavage (Schmidt et al., 2014b). The degenerate
consensus sequence for the favored cleavage substrates of SMG6 is
(U/A)(G/A)↓(C/A)N(C/U) (where N stands for any ribonucleotide and ↓ symbolizes
the cleavage site of SMG6), located close to the stop codon. Importantly,
this study was performed on natural substrates of NMD, in which
the stop codon activating NMD and its nucleotide environment have
been selected during evolution to be efficiently recognized and cleaved
by SMG6. Indeed, changing two nucleotides of an SMG6 cleavage site into
two nucleotides that do not fit the consensus sequence strongly inhibits
SMG6 cleavage. It will be essential to understand how SMG6 cleaves in
the vicinity of PTCs for which the nucleotide environment likely differs
from the consensus sequence.
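
The degenerate consensus reported for SMG6 cleavage can be turned into a small sequence scan. The sketch below reads the motif as (U/A)(G/A), then the cut, then (C/A)N(C/U); it only reports where a sequence fits that pattern and makes no claim about actual cleavage efficiency. The function name and the test sequence are made up for illustration.

```python
import re

# Consensus read 5' to 3': (U/A)(G/A) | (C/A) N (C/U), with the cut between the
# second and third positions. The zero-width lookahead lets overlapping
# candidate sites be reported.
SMG6_CONSENSUS = re.compile(r"(?=[UA][GA][CA][ACGU][CU])")


def candidate_cleavage_sites(rna: str) -> list[int]:
    """Return 0-based indices of the nucleotide immediately 3' of each candidate cut."""
    return [m.start() + 2 for m in SMG6_CONSENSUS.finditer(rna.upper())]


# Made-up sequence containing one match (UG|CAC starting at index 2).
sites = candidate_cleavage_sites("GGUGCACUU")
```

In practice such a scan would be restricted to the region around the stop codon, since the consensus was described for sites located close to it.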

Besides its role in NMD, SMG6 is also involved in the maintenance of 
the length of telomeres, as described in the SMG5 section. 

3.2.8 SMG7/EST1C 

The SMG7 gene is carried by chromosome 1 and encodes a protein of about
127 kDa that localizes in both the nucleus and the cytoplasm, where it
concentrates into P-bodies (Durand et al., 2007; Unterholzner and
Izaurralde, 2004). Several domains have been identified in SMG7
(Fig. 1.20). In particular, a TPR domain is present at the N-terminal end
of the protein, formed by a 14-3-3 domain responsible for the
heterodimerization with SMG5 and a helical hairpins domain. The TPR domain
is followed by a linker region and a proline-rich domain, called PC. The
PC domain is responsible for the mRNA decay activity of SMG7, by
stimulating decapping and deadenylation. In particular, the PC domain of
SMG7 was shown to interact directly with the protein POP2, which is a
catalytic subunit of the CCR4-NOT deadenylase complex (Loh et al., 2013).

Figure 1.20 Schematic representation of SMG7 and its functional domains.
The TPR formed by a 14-3-3 domain and a helical hairpins domain is located
at the N-terminal part of the protein, followed by a linker region and a
C-terminal proline-rich (named PC) region. The amino acid position is
indicated at the top.

SMG7, together with SMG5, interacts with phosphorylated UPF1 via its
14-3-3-like domain. This interaction occurs when UPF1 has been recruited
by the EJC and has been phosphorylated by SMG1. Interestingly, SMG7 is
able to target phosphorylated UPF1 to P-bodies, indicating that a step of
NMD occurs in these cytoplasmic foci (Durand et al., 2007; Fukuhara
et al., 2005; Unterholzner and Izaurralde, 2004).

3.2.9 SMG8/Amplified in Breast Cancer Gene 2 and SMG9

SMG8 and SMG9 were identified by mass fingerprinting analysis of peptides
coimmunoprecipitating with SMG1 (Yamashita et al., 2009). The SMG8 gene is
found on chromosome 17 and encodes a protein of about 110 kDa. The SMG9
gene is carried by chromosome 19 and encodes a protein of about 60 kDa.
Some functional domains have been identified on both proteins, but these
proteins have not been characterized in depth (Fig. 1.21).

SMG8 and SMG9 interact tightly with SMG1 and are thought to repress the
kinase activity of SMG1. Interestingly, SMG8 interacts with SMG1 only in
the presence of SMG9 and is involved in the recruitment of the SMG1
complex bound to UPF1 by the EJC and the ribosome stalled at the PTC. The
regulatory activity of SMG8 and SMG9 on SMG1 is complemented by additional
proteins, named RUVBL1 and RUVBL2 (Izumi et al., 2010).

Figure 1.21 Schematic representation of SMG8 (upper panel) and SMG9 (lower
panel) and their functional domains. SMG8 is composed of two conserved
regions named CR1 and CR2. SMG9 has a putative central nucleoside
triphosphatase (NTPase) domain. The amino acid position is indicated at
the top.

The involvement of all these NMD factors is widely accepted, but how a
nonsense codon is identified as a premature termination codon is still
subject to debate in mammalian cells, and two models are proposed
(Sections 3.3 and 3.4). Each one has strong experimental arguments, but
neither can explain all the described PTC recognitions. Based on both
models, a third one is starting to emerge and could be closer to the
truth.

3.3 EJC-Dependent Model 

In the 1990s, the position of introns relative to the stop codon was shown
to influence the activation of NMD (Carter et al., 1996; Cheng et al.,
1994; Sun and Maquat, 2000; Sun et al., 2000; Thermann et al., 1998; Zhang
et al., 1998b). Indeed, at least one intron has to be present downstream
of the PTC in order to elicit NMD, at a minimal distance of 50-55
nucleotides (Zhang et al., 1998a). Interestingly, physiological stop
codons are located in the last exon, except in a few cases where the stop
codon is carried by another exon; in 98% of these exceptions, it lies less
than 50 nucleotides upstream of the last exon-exon junction (Hawkins,
1988; Nagy and Maquat, 1998). It quickly became apparent that the element
inducing NMD was not the intron itself, but the splicing of the intron.
Indeed, a protein complex has been identified 20-24 nucleotides upstream
of the majority of exon-exon junctions (Sauliere et al., 2012; Singh
et al., 2012). This complex, called the EJC (exon junction complex), is
deposited as a mark to signal where the splicing reaction occurred (Le Hir
et al., 2000a,b). Some of its components have been detected on the RNA as
early as the spliceosome C complex stage (Fig. 1.6) (Ideue et al., 2007;
Reichert et al., 2002; Schmidt et al., 2014a; Singh et al., 2012; Wahl
et al., 2009; Zhang and Krainer, 2007).

The EJC is a multiprotein complex of about 335 kDa composed of about 17
proteins (Table 1.4). Among these proteins are splicing factors (SRm160,
RNPS1, SAP18, Pinin, Pnn/DRS, UAP56, MLN51), proteins involved in mRNA
export (UAP56, Magoh, Y14, TAP, and Aly/REF), and some core EJC proteins
(Y14, MAGOH, eIF4AIII, MLN51). Owing to its protein composition, the EJC
has been found to be involved in pre-mRNA splicing, mRNA export,
translation, and NMD (Andersen et al., 2006; Bono et al., 2004; Bono and
Gehring, 2011; Degot et al., 2004; Gatfield and Izaurralde, 2002; Gatfield
et al., 2001; Gehring et al., 2003, 2009; Jackson et al., 2010; Kim
et al., 2001; Le Hir et al., 2001; Lejeune et al., 2002; Li et al., 2003;
Luo et al., 2001; Ma et al., 2008; Murachelli et al., 2012; Strasser and
Hurt, 2000; Tange et al., 2005).

Table 1.4 Composition of the EJC

EJC proteins     Function                             References
RNPS1, SAP18,    Pre-mRNA splicing; PSAP complex      Tange et al. (2005);
Acinus, Pinin    coactivator; pre-mRNA splicing       Murachelli et al. (2012)
                 factor; transcriptional control
                 factor; splicing coactivator
Pnn/DRS          Pre-mRNA splicing factor             Li et al. (2003)
SRm160           Pre-mRNA splicing factor             Le Hir et al. (2001)
UAP56            Pre-mRNA splicing and mRNA           Gatfield et al. (2001);
                 export factor                        Luo et al. (2001)
Barentsz/MLN51   Pre-mRNA splicing factor             Degot et al. (2004);
eIF4AIII         Translation enhancer                 Andersen et al. (2006);
Magoh            mRNA export factor                   Bono et al. (2004);
Y14              mRNA export factor                   Jackson et al. (2010)
REF/Aly          mRNA export factor                   Le Hir et al. (2001);
                                                      Gatfield and Izaurralde (2002)
TAP-p15          mRNA export factor                   Strasser and Hurt (2000);
                                                      Lejeune et al. (2002)
UPF2             Nonsense-mediated mRNA decay factor  Le Hir et al. (2001)
UPF3/3X          Nonsense-mediated mRNA decay factor  Kim et al. (2001);
                                                      Gehring et al. (2003)
PYM              EJC disassembly factor               Bono et al. (2004)
SKAR             Translation preinitiation factor     Ma et al. (2008)

In NMD, the role of the EJC is thought to be the recruitment of the NMD
factors UPF3 or UPF3X, and then UPF2, all considered to be components of
the EJC. Then, during the pioneer round or first round of translation,
ribosomes remove EJCs while they are reading the ORF (Ishigaki et al.,
2001). On a wild-type mRNA, no EJCs will remain on the mRNA when the
ribosome reaches the physiological stop codon, which is the first stop
codon met by the ribosome. Indeed, physiological stop codons localize
either in the last exon, or less than 50-55 nucleotides upstream of the
last exon-exon junction (Hawkins, 1988; Nagy and Maquat, 1998). In the
case of a PTC-containing mRNA, the first stop codon met by the ribosome is
the PTC. It has to be located more than 50-55 nucleotides upstream of an
exon-exon junction, that is, about 30 nucleotides upstream of an EJC, in
order to be recognized as a PTC (Fig. 1.22). The minimum distance between
the PTC and a downstream EJC that prevents the removal of the EJC by the
ribosome seems to be about 30 nucleotides. Therefore, if the PTC is
situated less than 50-55 nucleotides upstream of the last exon-exon
junction, the ribosome will remove the last EJC on the mRNA before
reaching, or when it reaches, the PTC. This PTC will not elicit NMD, and
the PTC-containing mRNA will be immune to NMD and will promote the
synthesis of a truncated protein.
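
The 50-55 nucleotides rule can be expressed as a simple positional test.
The Python sketch below is only an illustration under stated assumptions:
positions are counted on the spliced mRNA, stop_end is the position
immediately after the stop codon, and the function names as well as the
default threshold of 50 nucleotides are choices made for the example, not
values fixed by the model.

```python
from typing import List, Tuple


def ejc_footprint(junction: int) -> Tuple[int, int]:
    """Approximate EJC position, 20-24 nt upstream of an exon-exon junction."""
    return junction - 24, junction - 20


def ejc_rule_predicts_nmd(stop_end: int, junctions: List[int],
                          min_distance: int = 50) -> bool:
    """True if at least one exon-exon junction lies more than min_distance
    nucleotides downstream of the stop codon (assumed threshold)."""
    return any(junction - stop_end > min_distance for junction in junctions)


# Hypothetical spliced mRNA with exon-exon junctions at these positions.
junctions = [100, 300, 500]
for stop_end in (200, 480, 600):
    print(stop_end, ejc_rule_predicts_nmd(stop_end, junctions))
```

As the text discusses further on, several documented PTCs do not follow
this rule, so such a test should be read as a first approximation only.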

The sequential events leading to the recognition of a PTC and the decay of
the mRNA carrying this PTC are as follows (Fig. 1.23). First, the EJC is
deposited in the nucleus 20-24 nucleotides upstream of most exon-exon
junctions, as a consequence of splicing (Le Hir et al., 2000a,b; Sauliere
et al., 2012; Singh et al., 2012). The EJC recruits UPF3 or UPF3X in the
nucleus, before the export of the mRNP to the cytoplasm (Lejeune et al.,
2002). UPF2 is then recruited by UPF3 or UPF3X before the first/pioneer
round of translation (Ishigaki et al., 2001). It is during the pioneer
round of translation that PTCs are recognized. Indeed, the ribosome reads
the mRNA and translates it into a peptide (Apcher et al., 2011). The
ribosome removes all EJCs that it meets until the first stop codon. If the
stop codon is in the last exon or less than 50-55 nucleotides upstream of
the last exon-exon junction, the mRNA is not subject to NMD. When the
first stop codon met by the ribosome is a PTC, at least one EJC remains
downstream of the PTC, since the ribosome pauses at the first stop codon.
UPF1, SMG1, SMG8, and SMG9 are recruited as a complex with the release
factors 1 and 3 (eRF1 and eRF3) to the ribosome stalled at the PTC, in
order to form the SURF complex (Kashima et al., 2006). SMG8 and CBP80
favor the interaction between UPF1 and UPF2, in order to form the DECID
complex (Hosoda et al., 2005; Yamashita et al., 2009). At that moment,
UPF2 interacts with SMG1, releasing the kinase activity of SMG1, which
phosphorylates UPF1. UPF2 and UPF3 or UPF3X also interact with UPF1 to
free its 5' to 3' helicase activity. Phosphorylated UPF1 induces the
departure of release factor 1, release factor 3, and the ribosome. In
parallel, phosphorylated UPF1 recruits the dephosphorylating complex
formed by the heterodimer SMG5/SMG7 (Jonas et al., 2013; Ohnishi et al.,
2003), SMG6, and the protein phosphatase 2A, leading to the concomitant
release of SMG1, SMG8, and SMG9. SMG7 is thought to be an adaptor
targeting UPF1 and the PTC-containing mRNP to the P-bodies (Durand et al.,
2007; Fukuhara et al., 2005; Unterholzner and Izaurralde, 2004). UPF1 is
dephosphorylated by the protein phosphatase 2A and SMG6 induces an
endonucleolytic cleavage in the vicinity of the PTC (Huntzinger et al.,
2008; Mascarenhas et al., 2013; Schmidt et al., 2014b). UPF1, thanks to
its helicase activity, removes proteins bound on the downstream fragment,
making it a substrate free of proteins and ready for decay by
exoribonucleases, such as XRN1 or XRN2. The decapping activity and the
deadenylation activity are also activated in order to release the mRNA
ends from their protective proteins, and make them sensitive to the
exoribonucleolytic decay (Cho et al., 2009; Lejeune et al., 2003).
PTC-containing mRNAs are then degraded by different decay pathways,
leading to an efficient reduction of the amount of PTC-containing mRNAs.

Figure 1.22 The position of the first stop codon determines the fate of
the mRNA. For a wild-type mRNA (upper panel) during the pioneer round of
translation, the ribosome translates the ORF and removes all the EJCs that
it meets, until it reaches the first stop codon, which is the
physiological stop codon located in the last exon. No EJCs will remain on
the mRNA, making this mRNA immune to NMD. In the case of a PTC-containing
mRNA (lower panel), the ribosome translates the ORF during the pioneer
round of translation until it reaches the first stop codon, which is the
PTC. The PTC is located more than 50-55 nucleotides upstream of an
exon-exon junction, making the distance between the ribosome pausing on
the PTC and the downstream EJC sufficient to maintain the EJC on the mRNA.
This mRNA will then be subject to NMD.

Figure 1.23 Model of NMD activation dependent on the EJC (see text).

The involvement of the EJC in NMD has been demonstrated by tethering some
components of the EJC downstream of a stop codon in order to mimic the
anchoring of an EJC and by measuring the level of the corresponding mRNA
(Gehring et al., 2008; Lykke-Andersen et al., 2000; Palacios et al.,
2004). Results show that tethering Y14, MAGOH, RNPS1, or eIF4AIII
downstream of the physiological stop codon of the β-globin mRNA leads to a
strong decrease in the level of the β-globin mRNA (Gehring et al., 2005).
Consistent with the involvement of the EJC in NMD, downregulating Y14
using siRNA significantly impairs NMD (Gehring et al., 2003).

An additional argument supporting the EJC-dependent model comes from
natural substrates of NMD, which are wild-type genes using NMD to regulate
their own expression (Mendell et al., 2004) (see Section 3.5). Some of
them activate a splicing event in their 3' UTR to promote the deposition
of an EJC downstream of the physiological stop codon and activate NMD. For
example, the splicing factor SRSF2 (SC35) regulates its expression via NMD
(Sureau et al., 2001). When the level of SRSF2 protein is abnormally high
in a cell, this splicing factor activates several cryptic splicing events
in the 3' UTR of its own pre-mRNA. Those splicing events do not occur when
the SRSF2 protein is not too abundant in the cell. The consequence of
these splicing events is the presence of several EJCs downstream of the
physiological stop codon. During the pioneer round of translation, this
stop codon will then be recognized as a PTC, and will induce NMD on the
SRSF2 mRNA, leading to a decrease in the level of SRSF2 protein, in order
to bring it back to the physiological level. Overall, these studies
demonstrate that the EJC plays an essential role in NMD.

The 50-55 nucleotides rule applies to many cases of PTCs, but not to all.
Examples have been described in which EJCs are present downstream of a PTC
and should induce NMD but do not (Knezevic et al., 1995; Nagy and Maquat,
1998). In some other situations, PTCs located less than 50-55 nucleotides
upstream of the last exon-exon junction are able to elicit NMD. For
instance, a nonsense mutation 15 nucleotides upstream of the last
exon-exon junction of the TCRβ gene elicits NMD, when clearly this PTC
should not be able to induce it according to the 50-55 nucleotides rule
(Wang et al., 2002). Several hypotheses could explain that exception,
starting with the presence of an EJC downstream of the last exon-exon
junction that would be deposited in an unusual way, independently of a
splicing event. Another hypothesis would be that NMD can be activated
independently of the presence of EJCs (see Section 3.5). In other cases,
some PTCs close to the translation initiation codon and situated more than
50-55 nucleotides upstream of an exon-exon junction fail to elicit NMD, as
has been reported for some PTCs in HNF-1beta mRNA, for instance (Harries
et al., 2005). The hypothesis to explain such an exception to the 50-55
nucleotides rule is a possible translation reinitiation downstream of the
PTC. Translation reinitiation can be envisaged when the PTC is close to
the original translation initiation codon and another translation
initiation codon is available in the close vicinity, downstream of the
PTC. Such an event will result in the synthesis of a truncated protein
lacking its N-terminal end.

Another argument challenges the EJC-dependent model of NMD activation,
based on two recent studies showing that only a fraction of the exon-exon
junctions carries an EJC 20-24 nucleotides upstream of the splicing event
(Sauliere et al., 2012; Singh et al., 2012). These studies suggest that an
EJC is not systematically deposited after a splicing event, and that the
EJC might not be a reliable mark to detect PTCs. Other studies present
evidence that the signal eliciting NMD is the distance between PABPC1 and
the stop codon, rather than the presence of an EJC downstream of a PTC
(see Section 3.4).

3.4 Model Involving the Distance Between the Stop Codon
and the Position of the Poly(A) Binding Protein C1

In yeast, C. elegans, or Drosophila, the EJC does not exist or does not
play a role in NMD (Conti and Izaurralde, 2005). In these species, the
size of the 3' UTR is relatively more homogeneous than in mammals. Because
of this homogeneity, it is possible that a mechanism measures the size of
the 3' UTR and elicits NMD when it is abnormally long. In mammals, a
similar model has been proposed based on the demonstration that a PTC does
not activate NMD if PABPC1 is artificially tethered closely downstream of
the PTC (Behm-Ansmant et al., 2007; Eberle et al., 2008; Ivanov et al.,
2008; Silva et al., 2008; Singh et al., 2008). The size of the 3' UTR in
mammals is much more heterogeneous than in the other species, making it
difficult for the moment to understand which markers are used to measure
the distance between the first stop codon of the ORF and the position of
PABPC1, and how NMD is activated in this model that does not require the
EJC. The model suggests that UPF1 and PABPC1 compete to interact with the
release factor 3 (eRF3): when PABPC1 is close to the stop codon, UPF1 has
no chance to interact with eRF3 and to activate NMD (Fig. 1.24). In
contrast, when the 3' UTR is long, the distance between the stop codon and
PABPC1 favors the interaction of release factor 3 with UPF1 rather than
with PABPC1, and hence the activation of NMD.

However, this model has recently been challenged by the demonstration that
a mutant PABPC1 unable to interact with eRF3 is still capable of
inhibiting NMD when it is tethered close to a PTC. The interaction between
PABPC1 and the translation initiation factor eIF4G promotes the
circularization of the mRNA, which is crucial to the inhibition of NMD on
PTC-containing mRNA, independently of the interaction between PABPC1 and
eRF3 (Fatscher et al., 2014).

Supporting the 3' UTR length model, it has been demonstrated that the
introduction of an intron downstream of a normal stop codon is not
sufficient to elicit NMD (Singh et al., 2008). Another supporting argument
comes from the discovery that UPF1 concentrates in the 3' UTR of mRNAs
without the requirement of an EJC (Hogg and Goff, 2010; Kurosaki et al.,
2014; Kurosaki and Maquat, 2013; Zünd et al., 2013). UPF1, and in
particular phosphorylated UPF1, would be the marker to differentiate
between a normal and an abnormally long 3' UTR, according to the number of
phosphorylated UPF1 molecules present downstream of the stop codon. Given
that tethering UPF1 downstream of a normal termination codon elicits NMD
without recruitment of UPF2, UPF3/UPF3X, and the EJC (Lykke-Andersen
et al., 2000), the EJC might not be absolutely necessary for NMD, as long
as UPF1 finds a way to anchor downstream of the stop codon. Consistent
with the idea that EJCs are not required to elicit NMD, two studies showed
that NMD can occur on eIF4E-bound PTC-containing mRNAs, that is, after the
pioneer round of translation, when EJCs are no longer present on the mRNA
(Durand and Lykke-Andersen, 2013; Rufener and Mühlemann, 2013).

Figure 1.24 Model for NMD activation involving the distance between PABPC1
and the stop codon. For mRNAs immune to NMD (A), the distance between the
first stop codon and PABPC1 is not recognized as long. The competition
between UPF1 and PABPC1 for the interaction with eRF3 is in favor of the
PABPC1 proteins. PABPC1 interacts with eIF4G, and eRF3 stimulates the
recycling of the ribosome for new rounds of translation. For mRNAs subject
to NMD (B), the distance between the first stop codon and PABPC1 is
recognized as long. eRF3 then has more chance to interact with UPF1 than
with PABPC1. UPF1 is then phosphorylated by SMG1, and SMG5/SMG7 and/or
SMG6 are recruited to dephosphorylate UPF1, via the protein phosphatase 2A
(PP2A), inducing the decay of the mRNA by exo- and/or endonucleolytic
cleavage.

However, this model is still incomplete, since the length of the 3' UTR is
highly variable in mammals, from 21 nucleotides to more than 8.5 kb
(Mignone et al., 2002). Indeed, more than 30% of mRNAs have a 3' UTR
exceeding 1000 nucleotides (Pesole et al., 2000), and these mRNAs with
long 3' UTRs are expressed. Paradoxically, a 3' UTR of 800 nucleotides or
more should be considered long, since this length has been shown to be
sufficient to elicit NMD on a wild-type mRNA (Eberle et al., 2008; Hogg
and Goff, 2010; Rebbapragada and Lykke-Andersen, 2009; Singh et al., 2008;
Yepiskoposyan et al., 2011). Until now, two mechanisms have been proposed
to explain how natural long 3' UTRs can escape NMD. The first one is a
folding of the 3' UTR that would bring PABPC1 proteins closer to the
normal stop codon (Eberle et al., 2008), and the other one is the presence
of cis elements, such as an A/U-rich sequence, in the 200 nucleotides
downstream of the normal stop codon (Toma et al., 2015). This sequence
would bind one or several factors that would promote normal translation
termination and inhibit NMD. These factors have not yet been identified,
and this mechanism is not universal, since an A/U-rich sequence is not
present in all long natural 3' UTRs, raising some unanswered questions for
this model.
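
The two quantities discussed above, the length of the 3' UTR and the A/U
content of the 200 nucleotides following the stop codon, are easy to
compute for a given sequence. The Python sketch below is a minimal
illustration: the function name utr_summary and its parameters are
hypothetical, the 800-nucleotide cutoff is the value quoted in the text,
and no A/U threshold is applied because none is given here.

```python
def utr_summary(utr: str, long_threshold: int = 800, window: int = 200) -> dict:
    """Summarize a 3' UTR sequence (stop codon excluded, poly(A) tail excluded).

    Reports its length, whether it reaches the length quoted as sufficient
    to elicit NMD, and the A/U fraction of the stop-proximal window.
    """
    utr = utr.upper().replace("T", "U")  # accept DNA-style input as well
    proximal = utr[:window]
    au_count = sum(base in "AU" for base in proximal)
    return {
        "length": len(utr),
        "flagged_long": len(utr) >= long_threshold,
        "proximal_au_fraction": au_count / len(proximal) if proximal else 0.0,
    }


# Toy sequences, invented for the example.
print(utr_summary("AUUUA" * 200))
print(utr_summary("GC" * 100))
```

A flag raised by such a summary is not a prediction of decay: as noted
above, many mRNAs with 3' UTRs longer than 800 nucleotides are expressed.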

The sequential steps leading to the activation of NMD have been well
studied in the case of the EJC model, unlike the 3' UTR length model, for
which further investigations will be necessary in order to identify the
proteins involved in this NMD activation pathway. However, UPF1 and SMG1
have been shown to be part of both NMD activation pathways, unlike SMG5,
SMG6, or SMG7, which seem not to be essential for the second NMD
activation pathway (Metze et al., 2013). Many steps are missing for this
second pathway, in particular whether UPF1 has to be phosphorylated and
dephosphorylated in order to induce NMD and, if so, by which mechanism,
since SMG5, SMG6, and SMG7 do not seem to be as involved as in the
EJC-dependent model (Metze et al., 2013).

Both models of NMD activation, the EJC-dependent model and the PABPC1-PTC
distance model, have strong experimental arguments to support their
existence. Rather than seeing those two models as mutually exclusive,
there might be a way to reconcile both, and to merge them into a single
NMD activation model that would explain the wide range of situations in
which mRNAs elicit NMD. The idea would be that both models could coexist,
and the EJC would be an activator/potentiator of NMD, but would not be
absolutely required (Bühler et al., 2006; Metze et al., 2013). Indeed, as
for the other organisms, the length of the 3' UTR would be the determinant
to elicit NMD in mammalian cells. During evolution, the EJC has likely
been integrated in the NMD process to optimize the recognition of PTCs.
For instance, except for the TCRβ gene (Wang et al., 2002), NMD efficiency
is generally not dependent on the location of the PTC in the ORF in
mammals (Cheng et al., 1994; Zhang et al., 1998b), unlike in yeast, where
a polar effect is clearly observed with a high NMD efficiency for the PTCs
situated at the most distal position from the 3' end of the mRNA (Leeds
et al., 1991; Losson and Lacroute, 1979; Pelsy and Lacroute, 1984). Since
mRNAs can be extremely long in mammals, the recognition of PTCs, in
particular those located at the beginning of the ORF, likely needs some
beacons to bring the NMD machinery within a reasonable distance of any
putative PTC. Then, the recognition of the PTC is achieved via EJCs, if
some are present downstream of the PTC, according to the EJC model
described previously, or according to the model involving the distance
between PABPC1 and the stop codon, when no EJCs are present downstream of
the PTC. The recognition of PTCs via the distance between the PTC and
PABPC1 could represent an ancestral mechanism conserved during evolution,
still active in mammals, and optimized by EJC-dependent PTC recognition.
The resulting NMD activation model would be that PTCs are recognized first
during the pioneer round of translation according to the EJC-dependent
model, and the fraction of PTC-containing mRNAs that would escape this
first recognition would be subject to a second analysis via the PABPC1-PTC
distance model.
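
The two-step logic of this combined model can be sketched as a small
decision function. This is an illustrative simplification, not a validated
predictor: the function name merged_model_call, the coordinate convention
(positions on the spliced mRNA, stop_end just after the stop codon), the
50-nucleotide EJC threshold, and the use of an 800-nucleotide distance
between the stop codon and the 3' end as a stand-in for the PABPC1
distance are all assumptions made for the example.

```python
from typing import List


def merged_model_call(stop_end: int, junctions: List[int], mrna_length: int,
                      min_ejc_distance: int = 50,
                      long_distance: int = 800) -> str:
    """Apply the EJC-dependent test first, then the distance test."""
    # Step 1: an exon-exon junction far enough downstream of the stop codon.
    if any(junction - stop_end > min_ejc_distance for junction in junctions):
        return "EJC-dependent"
    # Step 2: fallback on the distance between the stop codon and the
    # 3' end, used here as a crude proxy for the stop-to-PABPC1 distance.
    if mrna_length - stop_end >= long_distance:
        return "PABPC1-distance"
    return "not predicted"


# Hypothetical transcript with a single exon-exon junction at position 300.
print(merged_model_call(200, [300], 1000))
print(merged_model_call(900, [300], 2000))
print(merged_model_call(900, [300], 1200))
```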

Although the merged model can explain most NMD activations, some specific
cases still resist and cannot be explained by that model. For example,
PTCs in the last exon are often immune to NMD, as for the β-globin gene,
in which PTCs in the last exon lead to a dominant negative form of
β-thalassemia by producing truncated nonfunctional β-globin chains (Thein
et al., 1990). According to the EJC-dependent model, PTCs in the last exon
are not subject to NMD and generate truncated proteins, which is
consistent with what is observed in this situation. It is more difficult
to understand under the merged model, since the absence of an EJC
downstream of the PTC should be compensated by the detection of an
increased length of the 3' UTR compared to the WT mRNA, which is not what
is observed. Another particular example of PTCs that do not fit the
activation of NMD by any existing model comes from PTCs in the COL10A1
gene (Fang et al., 2013; Tan et al., 2008). This gene harbors three exons,
with the
last exon representing the majority of the ORF. PTCs in this exon elicit
NMD when located at the 3' end of the exon and do not activate NMD when
located in the rest of exon 3 (Fig. 1.25). According to the EJC-dependent
model, none of these PTCs should activate NMD, whereas according to the
model involving the distance between the first stop codon and PABPC1, or
the merged model, the PTCs most distant from the 3' UTR should be the ones
that elicit NMD. To explain that situation, an additional model has been
suggested, in which ribosomes translating the ORF, from the translation
initiation codon to the PTC, protect that region of the mRNA from being
degraded by RNases, unlike the mRNA region from the PTC to the 3' end,
which is not protected by ribosomes and is therefore accessible to RNases
(Brogna and Wen, 2009). Even that model does not fully fit the
experimental data, because not all PTCs in exon 3 induce NMD. It is always
possible to hypothesize a secondary structure that would bring PABPC1
close in the case of the upstream PTCs, but not for the downstream PTCs
because, for the latter, the ribosome would read most of this exon and
would destabilize the secondary structure, allowing NMD to be activated.
Such exceptions need to be further studied in order to understand why they
do not fit the majority of cases, and thereby improve the existing merged
model.

Figure 1.25 NMD activation in the last exon (exon 3) of the mRNA encoding
the Col10a1 protein. On this mRNA, PTCs in the last exon elicit NMD when
they are at the 3' end of exon 3, but not when they are located in the
rest of the exon (upper part). For the PTC-PABPC1 distance model (or the
merged model), if PTCs at the end of the exon elicit NMD, upstream PTCs
should also activate NMD, making this model irrelevant for this mRNA. For
the EJC-dependent model, only PTCs located more than 50-55 nucleotides
upstream of exon 3 should elicit NMD, which is not the case here, making
this model not suitable for this mRNA. Another proposed model is the
ribosome release model but, according to that model, upstream PTCs should
also activate NMD (see text), so this model also cannot fully explain the
NMD activation on this mRNA.

3.5 Natural Substrates of NMD 

The role of NMD in eliminating mRNAs harboring a PTC due to a frameshift
or nonsense mutation is essential for the cell, in order to prevent the
synthesis of aberrant proteins. Another role of NMD was highlighted more
recently: the regulation of the expression of some genes. This role was
first identified after downregulating hUPF1 or hUPF2 in human HeLa cells
using siRNAs (Mendell et al., 2004). The transcriptomic analysis led to
the conclusion that about 5% of the human genome is upregulated when NMD
is inactivated. It is possible to organize most of these genes into
categories according to their mode of regulation involving NMD
(Fig. 1.26). Indeed, genes with an upstream open reading frame (uORF),
genes harboring an intron in the 3' UTR, genes encoding a selenoprotein,
genes subject to alternative splicing leading to the introduction of a PTC
by intron retention or frameshift, and genes targeted by transposon
elements that introduce a PTC or induce a frameshift, are natural
substrates of NMD. Depending on external and/or internal parameters, these
genes are expressed by escaping NMD or they are repressed if they are
subject to NMD.

For example, in the case of an mRNA harboring a uORF, the mRNA escapes NMD
as long as only the major open reading frame (mORF) is translated and the
uORF is ignored by the translation machinery. When the uORF is translated,
the stop codon of the uORF is recognized as a PTC and will activate NMD.
The consequence is a decrease in the synthesis of the protein translated
from the mORF. It is, therefore, an efficient regulation pathway related
to the translation of the uORF.
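
To make the notion of a uORF concrete, the Python sketch below scans the
region upstream of the main initiation codon for an AUG followed by an
in-frame stop codon. It is a minimal illustration: the function name
find_uorfs and the 0-based coordinate convention are assumptions, only AUG
is considered as an initiation codon, and the presence of a uORF says
nothing about whether it is actually translated.

```python
from typing import List, Tuple

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna: str, main_start: int) -> List[Tuple[int, int]]:
    """Return (start, end) pairs for ORFs whose AUG begins upstream of
    main_start; end is the index just after the in-frame stop codon."""
    rna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    uorfs = []
    for start in range(main_start):
        if rna[start:start + 3] != "AUG":
            continue
        # Walk codon by codon from the upstream AUG to the first stop.
        for pos in range(start + 3, len(rna) - 2, 3):
            if rna[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


# Toy transcript: uAUG at index 2, main AUG at index 14 (invented example).
print(find_uorfs("GGAUGCCCUAAGGGAUGAAAUUUUGA", main_start=14))
```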

For mRNAs encoding a selenoprotein, a UGA codon is present in the ORF, and
can be recognized by a transfer RNA carrying a serine transformed into a
selenocysteine, in the presence of selenium. Such recognition of a UGA
stop codon to incorporate a selenocysteine is tightly controlled, and
requires the presence of a cis element on the mRNA, called the Sec
insertion sequence (SECIS), and the presence of factors such as the SECIS
binding protein 2 (SBP2) and the translation factor eEFsec. When selenium
is present in the cellular environment, such a UGA codon leads to the
incorporation of a selenocysteine. In contrast, when the cellular
environment is poor in selenium, the UGA codon will be recognized as a PTC
and will activate NMD in order to prevent the synthesis of selenoproteins.

Figure 1.26 Categories of natural substrates of NMD. Exons are represented
by a rectangle and introns by a thick horizontal line. uAUG, uORF, mAUG,
and mORF stand for the upstream translation initiation codon, upstream
open reading frame, major translation initiation codon, and major open
reading frame. STOP shows the position of physiological stop codons.

A major source of PTCs is alternative splicing. About 35% of mRNA isoforms
generated by alternative splicing harbor a PTC (Green et al., 2003). The
use of alternative splicing to incorporate a PTC and to promote the
silencing of a gene is called the regulated unproductive splicing and
translation (RUST) mechanism (Lewis et al., 2003). This mechanism has been
conserved during evolution from yeast to human, highlighting its
requirement for the homeostasis of gene expression (Lareau et al., 2007a).
Many genes encoding splicing factors use RUST, in particular those of the
serine/arginine-rich protein (SR protein) family (Lareau et al., 2007b;
Lewis et al., 2003; Sureau et al., 2001; Wollerton et al., 2004). The
members of this family regulate their expression by affecting the splicing
events occurring on their own pre-mRNA, in particular by incorporating a
poison cassette exon containing a PTC (Fig. 1.27) (Lareau and Brenner,
2015; Lareau et al., 2007b; Lejeune et al., 2001; Sureau et al., 2001).

Genes encoding NMD factors also use NMD as a gene regulation
pathway (Yepiskoposyan et al., 2011). However, genes encoding NMD
factors use either an upstream ORF or a long 3' UTR to activate
NMD, suggesting a complex way to regulate the activation of NMD on
these mRNAs, involving translation and likely the secondary structure of the
3' UTR.

Figure 1.27 Regulation of SR protein expression via alternative splicing and NMD.
When the level of the SR protein is normal, the mRNA does not include the poison
exon cassette containing the PTC. This mRNA will be translated to generate the SR
protein. When the level of the SR protein is abnormally high, the poison exon
cassette is retained in the mRNA, introducing a PTC in the ORF. That mRNA will be
degraded by NMD, leading to an absence of SR protein synthesis and a decrease in
the level of that protein.

Regulation of gene expression by NMD is widely accepted, but
the number of genes using NMD as a regulation pathway is likely less than
5% of the human genome. Indeed, 5% of the human genome is upregulated
after downregulation of UPF1 or UPF2 by siRNAs (Mendell et al., 2004).
However, about 5% of the human genome is also downregulated under the
same experimental conditions, indicating that this approach measures both
direct and indirect effects of NMD inhibition. Therefore, among the 5% of genes
upregulated after NMD inhibition, a proportion is upregulated because another
gene product was affected by the NMD inhibition. Consistent with that,
another study analyzed some of the genes initially found to be upregulated
when NMD is inhibited; it found that the upregulation
already occurs at the pre-mRNA level, suggesting that it is a transcriptional
effect and not an effect via NMD (Viegas et al., 2007). Indeed, it is difficult to
distinguish genes directly regulated by NMD from genes whose expression is
affected by the absence of NMD because of the overexpression of a
direct NMD substrate (Fig. 1.28). The proportion of natural targets of NMD is
therefore expected to be less than 5% of the human genome.

Figure 1.28 Impact of NMD inhibition on gene expression. When NMD is active, natural
substrates of NMD are degraded by NMD. The products of these natural substrates of
NMD, such as transcription factors, can regulate the expression of other genes. Those
genes are not expressed when NMD degrades the mRNA encoding the transcription factor
(left side), unlike when NMD is inhibited (right side). A global transcriptional
analysis under NMD inhibition will select natural substrates of NMD and genes whose
expression depends on the level of natural substrates of NMD, even though the latter
are not subject to NMD.


3.6 Regulation 

NMD, like any other quality control mechanism, has to be tightly controlled
in order to fulfill its functions accurately. Although the molecular details of
the mechanism have been extensively studied, the regulation of NMD is only
beginning to be investigated. This regulation can affect the entire NMD
mechanism, and often reflects a cell status (cell differentiation or cell death,
for instance), or modulate the efficiency of one or several specific NMD factors
(phosphorylation, miRNA, or autoregulation). We are now going to explore different
regulation pathways that have been described for NMD.

3.6.1 Autoregulation

As we just saw, NMD factors use NMD to regulate their own expression
(Yepiskoposyan et al., 2011). Such autoregulation suggests that NMD
efficiency has to be kept within limits: if it becomes too high or too low, the
level of NMD factors is adjusted so that it is maintained at the physiologic
level. While a low level of NMD could be deleterious for the cell because some
mRNAs would be expressed and would generate harmful proteins, it is
more surprising that overly efficient NMD is not allowed in cells
either. That suggests that some NMD targets have to escape decay for a yet
unknown reason. However, it has already been reported that NMD reduces
the level of a PTC-containing mRNA to 5-25% of the level of the
corresponding wild-type mRNA (Kuzmiak and Maquat, 2006), but most of
the mRNAs escaping from NMD are not translated (You et al., 2007).

3.6.2 Tissue Specificity 

Interestingly, the efficiency of NMD has been shown to be regulated according
to the tissue or the cell type (Bateman et al., 2003; Viegas et al., 2007).
Indeed, the levels of some PTC-containing mRNAs vary from
one tissue to another, as is the case for collagen X in patients with
Schmid metaphyseal chondrodysplasia, for which NMD was efficient in
cartilage cells and absent in noncartilage cells (Bateman et al., 2003). An
explanation could be found at the level of NMD factors or related factors,
since a comparison of them in different cell types showed that the amount
of RNPS1, for example, correlates with the efficiency of NMD, suggesting
that the level of RNPS1 could be a marker of the efficiency of NMD in
a cell type or in a tissue (Viegas et al., 2007).

3.6.3 Inhibition During Apoptosis 

NMD can also be regulated under specific cell conditions. Very recently,
it has been demonstrated that NMD is inhibited during apoptosis due to
the cleavage by caspases of at least the NMD factors UPF1 and UPF2, but
not UPF3X (Jia et al., 2015; Popp and Maquat, 2015). The specific cleavage
of UPF1 and UPF2 suggests that the action of caspases against NMD during
apoptosis is targeted, rather than resulting from a nonselective cleavage of
components of a pathway. In support of this idea, the N-terminal fragments
of UPF1 or UPF2 generated by caspases have a slight but reproducible
apoptotic effect (Jia et al., 2015). Interestingly, the NMD inhibition occurring
during apoptosis leads to the expression of genes regulated by NMD,
some of which are proapoptotic (Popp and Maquat, 2015). These studies
suggest that NMD has to be inhibited during apoptosis, and that this
inhibition plays an active role in cell death progression.

3.6.4 miRNA 

MicroRNAs (miRNAs) regulate a wide range of genes, either by affecting
the stability of the mRNAs that they target or by inhibiting their translation
(Bartel, 2009; Jonas and Izaurralde, 2015). Each miRNA targets a
set of genes, and several miRNAs can influence the expression of the same
gene. Among these miRNAs, miR128 regulates the expression of UPF1
(Bruno et al., 2011).

The regulation of NMD by the miRNA pathway is likely not
restricted to the regulation of UPF1 expression, since the protein AGO2, a
catalytic subunit of the RISC (RNA-induced silencing complex), preferentially
binds mRNAs carrying EJCs and CBP80/CBP20, that is, before the
pioneer round of translation, and thus before a PTC can elicit NMD. This binding
blocks translation of the bound mRNA and prevents its decay by NMD
(Choe et al., 2010, 2011). It is therefore expected that some NMD substrates
escape from NMD thanks to miRNA recognition and binding (Note 1.2).


NOTE 1.2 Repression Mechanism of Gene Expression by 
miRNAs 

MicroRNAs (miRNAs) are small noncoding RNAs about 22 nucleotides long.
They repress gene expression by inhibiting translation and/or promoting mRNA
decay. To do so, the miRNA generally binds to the 3' UTR of its target mRNA and
recruits an argonaute protein and a GW182 protein in order to form the
microRNA-induced silencing complex (miRISC). The miRISC inhibits translation by
interfering with the assembly or the function of the translation initiation complex
eIF4F (composed of eIF4G, eIF4E, and eIF4A). The GW182 protein recruits the
PAN2/PAN3 deadenylase complex and the CCR4-NOT 3'-5' exoribonuclease complex,
and also the decapping complex and the 5'-3' exoribonuclease XRN1, in
order to degrade the miRNA target mRNA. About 1500 miRNAs are present in human
cells and each miRNA targets hundreds of mRNAs, suggesting that a significant
proportion of human genes are regulated by miRNAs. MiRNAs are themselves regulated
in order to silence the expression of specific genes at the right moment and
for a specific period of time. Indeed, the translation of some mRNAs can be
repressed by miRNAs, and these mRNAs can subsequently be subject to bulk
translation.

3.6.5 Phosphorylation

Posttranslational modifications, in particular phosphorylation, play a crucial
role in NMD. UPF1 and UPF2 are phosphorylated and, at least for
UPF1, the phosphorylation and dephosphorylation steps are required for
the activation of NMD. The phosphorylation of UPF1 occurs on several
serine/threonine-glutamine motifs (at least T28, S1078, S1096, S1116)
and is carried out by the kinase SMG1 (Matsuoka et al., 2007; Ohnishi et al., 2003;
Okada-Katsuhata et al., 2011; Yamashita et al., 2001). The dephosphorylation
of UPF1 occurs in a protein complex composed of the proteins
SMG5, SMG7, SMG6, and the protein phosphatase 2A (PP2A).

UPF2 is also a phosphoprotein (Chiu et al., 2003), suggesting that the
functions of that protein could be modulated via phosphorylation and
dephosphorylation. However, this regulation
has not yet been studied in mammals, making the role of UPF2 phosphorylation
in NMD hypothetical.

3.6.6 Regulation by Availability of NMD Factors 

Many if not all proteins involved in NMD play multiple roles in cells. If a
pathway becomes very active and requires a substantial amount of factors
that are also involved in NMD, making these factors limiting
for other pathways, such a situation is expected to affect
the efficiency of NMD. The best example to illustrate this is the protein
UPF1, which plays a role in different pathways, in particular in NMD and
Staufen-mediated mRNA decay (SMD) (Kim et al., 2005). UPF1 interacts
with both UPF2 (an NMD factor) and Staufen (an SMD factor), and both
proteins share the same binding site on UPF1 (Gong et al., 2009). Indeed,
if the level of UPF2 decreases in the cell, SMD becomes more efficient
and, in contrast, NMD becomes less efficient. This competition between
NMD and SMD has been reported to
occur during myogenesis, for instance. During this differentiation, UPF1 is
found to interact more with Staufen than with UPF2, favoring SMD at the
expense of NMD (Gong et al., 2009).

3.7 UPF2, UPF3X/UPF3b Independent Pathway 

Although many proteins have been identified as essential factors of NMD,
such as UPF and SMG proteins, some NMD reactions do not
require all of these proteins. In particular, UPF2 has been shown to be
dispensable, depending on the composition of the EJC (Gehring et al., 2005).
Indeed, if the EJC lacks RNPS1, NMD can occur even in the absence of
UPF2. In contrast, the presence of RNPS1 in the EJC requires UPF2 to activate
NMD. Although the composition of EJCs seems to be homogeneous on
endogenous mRNAs (Singh et al., 2012), the possibility of a heterogeneous EJC
composition cannot be definitively excluded and would support the idea
that NMD can be activated by different pathways, involving UPF2 or not.

UPF3X/UPF3b has also been shown not to be absolutely required
for NMD, since downregulating this factor does not impair NMD (Chan
et al., 2009). This can be explained by the fact that UPF3/UPF3a, the product
of the paralog gene of UPF3X/UPF3b, can functionally replace
UPF3X/UPF3b. However, as will be explained in the next chapter,
the replacement of UPF3X/UPF3b by UPF3/UPF3a is only partial, and a
lack of UPF3X/UPF3b leads to severe intellectual disability, suggesting
that the replacement might not be efficient in all types of cells, tissues, or
NMD reactions.

The fact that some UPF proteins can be dispensable in some NMD reactions
demonstrates that NMD can be activated by different pathways. The
role, the specificity, the composition, and the overlapping functions of
these different pathways remain largely unknown.

3.8 Pathologies Associated with NMD Defaults 

In mammals, the abolition of NMD is lethal, as demonstrated in mice by
a knockout of the UPF1 gene that leads to embryonic lethality at 3.5 days
post coitum (dpc) (Medghalchi et al., 2001). UPF1 is not the only NMD
factor whose absence has been shown to induce embryonic lethality. For
instance, mouse embryos lacking SMG1 expression die around 8.5 dpc
(McIlwain et al., 2010). Although this lethality cannot be exclusively attributed
to NMD impairment, since both UPF1 and SMG1 proteins have been
shown to participate in different cellular pathways (see chapter: General
Aspects Related to Nonsense Mutations; Sections 3.2.1 and 3.2.5), it has been
shown that NMD plays a crucial role in the development of tissues and organs.
For example, the expression of a dominant negative version of UPF1
(UPF1 R843C) impairs NMD and promotes a developmental arrest of fetal
thymocytes (Frischmeyer-Guerrerio et al., 2011). Thymocytes are expected
to be affected by a lack of NMD since the locus encoding the T-cell receptor
is subject to a programmed DNA rearrangement and two-thirds of these
rearrangements lead to the introduction of a PTC (Mallick et al., 1993). As
another example, in a conditional UPF2 knockout mouse, UPF2 has been
demonstrated to be required for terminal liver differentiation during
development and for liver regeneration in the adult (Thoren et al., 2010).
It is therefore expected that human pathologies associated with a defect in
NMD factors are rare.

Nevertheless, some pathologies associated with the loss of function of one
NMD factor have been described, indicating that cells sometimes find a way to
compensate for the missing function. Indeed, UPF3X/UPF3b can be dispensable for
embryonic survival, likely due to the presence of the UPF3/UPF3a
protein that can functionally replace UPF3X/UPF3b (Chan et al., 2009).
However, the replacement of UPF3X/UPF3b by UPF3/UPF3a is not total,
and this incomplete compensation causes the development of pathologies. For
example, a point mutation changing lysine 367 into an asparagine in the
UPF3X/UPF3b gene leads to an X-linked intellectual disability (Nguyen et al.,
2012; Tzschach et al., 2015) and other neurological disorders, such as
schizophrenia, autism, or attention-deficit hyperactivity disorder (Addington
et al., 2011; Laumonnier et al., 2010; Tarpey et al., 2007). To understand the
mechanism behind the development of neuronal pathologies associated with a
defect in UPF3X/UPF3b functions, the expression of UPF3X/UPF3b has been
studied. UPF3X/UPF3b is expressed during brain development
at a level about 10-fold higher than UPF3/UPF3a (Jolly et al., 2013). In the same
study, the authors also show that upon loss of UPF3X/UPF3b, neural
progenitor cells divide more and differentiate less than in the presence
of UPF3X/UPF3b, suggesting a role for this protein in the cell differentiation
program. A transcriptomic analysis also revealed that in the absence of
UPF3X/UPF3b, 16 genes highly expressed in neurons are deregulated
(SIX3, TMOD2, NRCAM, or ROBO1, for instance), indicating the impact
of the absence of UPF3X/UPF3b on the expression of some genes.

Sometimes the mutation at the origin of the pathology does not directly affect
a gene involved in NMD, but is located in a gene that regulates
the expression of an NMD factor. This is found, for example, in some cases
of craniofacial dysmorphism in which a nonsense mutation was found in
the SATB2 gene, leading to the generation of a truncated SATB2 protein
with a dominant negative effect, since this protein dimerizes to form a
functional transcriptional complex. One of the targets of SATB2 is the
UPF3X/UPF3b gene, explaining why patients with that mutation have
the same phenotype as patients with mutations in the UPF3X/UPF3b
gene (Leoyklang et al., 2013), and explaining why the truncated protein is
synthesized from a PTC-containing mRNA, since NMD is inhibited.

UPF3X/UPF3b is not the only gene involved in NMD that can be
found at the origin of a human pathology, in particular of intellectual
disorders. A genome analysis of patients with intellectual disorders
reveals that, besides mutations in UPF3X/UPF3b, genomic alterations can
be found in UPF2, UPF3/UPF3a, SMG6, eIF4AIII, or RNPS1 (Nguyen
et al., 2013). Alterations in these genes can likely be found in patients because
their inactivation does not severely affect cell viability. It is also likely
because NMD can be activated by different combinations of proteins, as
we saw before, making it possible to impair one of them without
significantly affecting the expression of genes using NMD to regulate their
expression.

4 CORRECTION OF NONSENSE MUTATIONS, A CASE 
OF TARGETED THERAPY 

Before the development of personalized or targeted medicine, all patients
with the same pathology generally received the same treatment. Once the
molecular defects at the origin of genetic diseases were deciphered, it became
necessary to design treatments according to the molecular event inducing
the pathology, in order to improve safety and efficiency. The genetic
background is also a key component of personalized medicine, since it has
already been shown to influence the efficacy of a treatment, explaining why
the same drug has variable effects within a cohort of patients (Cazzola
et al., 2015; De Mattia et al., 2015; Graziani and Nistico, 2015). Molecular
pathways such as NMD have been demonstrated to vary from
one patient to another, which at least partly explains the influence
of the genetic background on genetic diseases related to nonsense
mutations (Haas et al., 2015; Linde et al., 2007; Nguyen et al., 2013; Viegas
et al., 2007; Welch et al., 2007).

Nonsense mutations are well suited to targeted therapies because such mutations
can be found in any ORF, and the molecular consequences are
shared by all mutant genes, unlike missense, deletion, or insertion mutations,
for example, whose consequences depend on the gene and the mutant
protein. Indeed, the molecular consequence of a nonsense mutation is
the silencing of the mutant gene, due to the specific and fast decay of the
corresponding mRNA by NMD. Therefore, strategies developed
to correct a nonsense mutation in a specific gene may apply to nonsense
mutations in various genes, since such therapeutic strategies are developed
independently of the function of the mutant gene. However, as will
be described later, some parameters related to the position of the mutation
influence the correction efficiency, in particular when a readthrough
strategy is involved.

Correction of nonsense mutations aims to rescue the expression of
genes harboring a nonsense mutation, and will apply to a fraction of patients
with a genetic disease. Although, based on in vitro and ex vivo data,
molecules capable of rescuing the expression of genes harboring a nonsense
mutation show variability in their efficiency according to the cell type or
the nucleotide environment around the mutation, it is expected that one
molecule will be able to treat many patients. For all these reasons, therapies
focusing on the correction of nonsense mutations are targeted therapies,
rather than personalized medicine.

A genetic disease is defined as a pathology related to a mutation in at least
one gene that is responsible for the disorder. The mutation can lead to
the total absence of expression of the mutated gene (null mutation or loss
of function mutation), to an increase of the wild-type function or to a new
function of the mutated protein (gain of function mutation), or to a positive
or negative modulation of the expression of the mutated gene. The human
genome encodes about 20,000-25,000 genes that can be mutated during
DNA replication, DNA repair, or after the insertion or the deletion of
transposable elements. However, to date, only about 8000 genetic pathologies
have been described, suggesting that mutations in the other genes either do
not have a pathologic effect or are lethal, explaining why no
pathologies are associated with a modification of their expression. It is also
possible that the absence of expression of a gene is asymptomatic.

Among the 8000 genetic diseases are rare genetic diseases, such as
cystic fibrosis (CF), Duchenne muscular dystrophy (DMD), hemophilia, or
dwarfism, for instance, and frequent genetic diseases such as cancer, metabolic
diseases, or neurologic syndromes in which a specific genetic defect
is often found at the origin of the pathology.

Mutations at the origin of a pathology can be very diverse: point mutations
changing one amino acid into another one (missense
mutation) or changing an amino acid codon into a stop codon (nonsense mutation),
and deletion or insertion of one nucleotide in the open reading frame (ORF),
leading to a -1 or +1 frameshift mutation, respectively. Besides point mutations,
there are also all the small and large insertions or deletions that
are due to replication or transcription mistakes, retrotransposon insertion or
deletion, recombination events, or chromosome migration defects. Often,
these mutations lead to the generation of a nonfunctional gene product, or
to an absence of expression of the mutant gene. However, mutations can also
lead to the synthesis of a new product with a function deleterious for the cell.
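
The distinction between these categories of point mutations can be sketched in
a few lines of Python. The snippet below is an illustration, not a tool
described in this chapter: it uses the standard genetic code, and the function
name classify_point_mutation, the toy ORF, and the "silent" and "stop-loss"
labels (which the text above does not name) are assumptions made for the
example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_point_mutation(orf, position, new_base):
    """Classify a single-base substitution at a 0-based position of a DNA ORF."""
    offset = position % 3
    codon_start = position - offset
    old_codon = orf[codon_start:codon_start + 3]
    new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if new_aa == old_aa:
        return "silent"      # same amino acid (label not used in the text)
    if new_aa == "*":
        return "nonsense"    # an amino acid codon becomes a stop codon (PTC)
    if old_aa == "*":
        return "stop-loss"   # the natural stop codon is lost (label not used in the text)
    return "missense"        # one amino acid is replaced by another


# Toy ORF (hypothetical): ATG CAG TGG TAA encodes Met-Gln-Trp-stop.
orf = "ATGCAGTGGTAA"
print(classify_point_mutation(orf, 3, "T"))  # CAG -> TAG: nonsense
print(classify_point_mutation(orf, 4, "T"))  # CAG -> CTG: missense
print(classify_point_mutation(orf, 5, "A"))  # CAG -> CAA: silent
```

Insertions or deletions are not covered by this sketch; for those, the length
change modulo 3 determines whether the reading frame is shifted.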

1 RARE DISEASES 

A rare disease is a pathology affecting fewer than 1 person in 1500 in North
America, or 1 in 2000 in Europe, for instance, meaning that 30 million Europeans
(corresponding to about 8% of the European population) are affected
by a rare disease. In 80% of cases, the origin of the pathology is a genetic
defect and, in 20%, the origin can be a viral or a bacterial infection, or an
environmental effect. More than 8000 rare
diseases have been described and, in most cases, no treatments are available,
explaining why rare diseases are also called orphan diseases. Seventy-five
percent of rare disease patients are children, and life expectancy
does not exceed 5 years in 30% of cases (source: www.eurordis.org).
Among the patients with a rare disease due to a genetic cause, about
11% harbor a nonsense mutation in the gene responsible for the pathology
(Mort et al., 2008). Correction of the nonsense mutation can potentially
apply to about 10% of patients from the 8000 rare diseases, since they are
generally monogenic. Three different rare diseases are described
below, since they will be used as examples in further chapters.
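
As a rough order-of-magnitude check, the figures quoted above can be combined
in a short calculation. The Python sketch below is only a back-of-the-envelope
estimate: it assumes that the 80% genetic share and the 11% nonsense share
apply uniformly to the 30 million European patients, which the text does not
state explicitly, and the variable names are illustrative.

```python
# Figures quoted in the text for Europe.
rare_disease_patients = 30_000_000
genetic_percent = 80   # share of cases with a genetic origin
nonsense_percent = 11  # share of genetic cases with a nonsense mutation (Mort et al., 2008)

# Assumption: both percentages apply uniformly to the patient population.
genetic_patients = rare_disease_patients * genetic_percent // 100
nonsense_patients = genetic_patients * nonsense_percent // 100
share_of_all = 100 * nonsense_patients / rare_disease_patients

print(genetic_patients)        # 24000000
print(nonsense_patients)       # 2640000
print(round(share_of_all, 1))  # 8.8
```

Under these assumptions, nonsense mutations would concern roughly 2.6 million
European patients, that is, 11% of genetic cases or close to 9% of all rare
disease patients, which is consistent with the "about 10%" quoted above.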

1.1 Duchenne Muscular Dystrophy (DMD) 

DMD is a pathology affecting primarily boys, since the gene responsible for
the pathology is carried by the short arm of the X chromosome at the Xp21
locus. The frequency of DMD is one affected boy in 3500 newborn boys
(Emery, 1993). Mutations affecting the dystrophin gene are responsible for
Duchenne muscular dystrophy, the most severe form of the pathology,
and for Becker muscular dystrophy (BMD), the attenuated form.
BMD patients have a quality of life almost similar to that of healthy persons
and, in some cases, do not even know that they are affected, unlike DMD
patients, who suffer a decrease in muscle mass leading to the need
for a wheelchair by the age of 12, and to a life expectancy of around 20 years.
The difference between DMD and BMD is that in DMD no dystrophin
function is detected, whereas in patients with BMD the function
of dystrophin is attenuated, often because a mutation decreases the level of
dystrophin expression, or because the level of dystrophin is similar to
wild-type but the dystrophin protein is only partially functional.

Figure 2.1 Organization of the dystrophin gene. On the top are indicated the
promoters, with their corresponding tissue specificity in the lower part. The
position of some dystrophin exons is shown on the top of the gene, and some
distances in kilobases (kb) are provided under the dystrophin gene.

The dystrophin gene is the largest gene in the human genome, covering
2.5 megabases (Coffey et al., 1992; Monaco et al., 1992). This gene contains
79 exons, and at least seven different promoters have been described that lead
to the synthesis of dystrophins of variable length (Fig. 2.1). Each promoter
is used in a tissue- or development-specific manner. The main tissues in
which dystrophin has been found are skeletal, cardiac, and smooth muscle,
and, to a lesser extent, neurons (Feener et al., 1989; Nudel et al., 1989).
Three promoters, located more than 320 kilobases from exon 2, drive
the expression of the largest version of dystrophin, each incorporating a
specific exon 1. One promoter expresses dystrophin in the brain (promoter b)
(Nudel et al., 1989), another one in muscle and glial cells (promoter m), and
the last one expresses the biggest isoform of dystrophin in Purkinje
cells located in the brain (promoter p) (Gorecki et al., 1992). The dystrophin
gene is transcribed into a pre-mRNA subject to alternative splicing, in
particular in the last exons, and generates a main 14 kb-long mRNA.

The full-size protein synthesized from dystrophin mRNA is a huge protein
of 427 kDa, made of 3685 amino acids. The dystrophin
protein localizes under the cell membrane, on the cytoplasmic side of
the sarcolemma, and is a part of the muscle cytoskeleton. Dystrophin has been
found in tissues other than muscle, including the brain, where it concentrates
at the post-synaptic area, in particular in a subset of GABAergic synapses
(Fritschy et al., 2012; Lidov et al., 1990). Dystrophin is composed of several
domains, starting with an actin-binding site at the N-terminal part (Byers et
al., 1989; Fabbrizio et al., 1995; Koenig et al., 1988). The central rod domain
is made of 24 spectrin-like triple helical repeats, interrupted by four
proline-rich spacer domains (Arahata et al., 1988; Cross et al., 1990; Koenig
and Kunkel, 1990; O’Brien and Kunkel, 2001; Roberts, 2001). The rod domain is
followed by a WW domain and a cysteine-rich domain, necessary for the
interaction with the dystroglycan and syntrophin proteins (Huang et al., 2000;
Ishikawa-Sakurai et al., 2004; Jung et al., 1995; Ponting et al., 1996;
Rentschler et al., 1999; Suzuki et al., 1994; Winder et al., 1995). Finally, the
C-terminal domain, responsible for the interaction with the
dystrophin-associated glycoproteins, which links the cytoskeleton and the
extracellular matrix via dystrophin, also includes some putative sites for
phosphorylation (Lederfein et al., 1993; Milner et al., 1993; Zubrzycka-Gaarn et
al., 1988) (Fig. 2.2).

Besides the full size dystrophin, shorter versions have been described, 
and called dystrophin protein (Dp), followed by a number corresponding 
to their molecular weight (Fig. 2.3). These shorter versions of dystrophin 
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Figure 2.2 Cellular localization of dystrophin and main partners involved in the muscle 
cytoskeleton. Dystrophin localizes under the cell membrane, and interacts with actin 
filaments on one side, and with p-dystroglycan, a-dystrobrevin, and syntrophin com¬ 
plex on the other side. Thanks to its interactions, dystrophin links the cytoskeleton to 
the extracellular matrix. 
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Figure 2.3 Organization of the functional domains of dystrophin isoforms. The dif¬ 
ferent domains of dystrophin are indicated, starting by the actin-binding domain, in 
black, at the N-terminal end of the protein, the rod domain, in blue, composed of 24 
repeats of a spectrin-like motif found in the spectrin protein, including four prolin-rich 
linkers (black boxes named from 1 to 4), the WW protein-protein interaction domain, a 
cysteine-rich domain composed by a domain EF and a domain ZZ, both also involved 
in protein-protein interaction, and a coiled coil (cc) domain, also involved in protein- 
protein interaction. The N-terminal part of the dystrophin is in charge of the interaction 
with the cytoskeleton, via its actin-binding domain, and the C-terminal part of dystro¬ 
phin establishes the interaction with the dystrophin-associated proteins (DAP) such as 
the p-dystroglycan, a-dystrobrevin, and the syntrophins. 

are produced thanks to the use of internal promoters that introduce a
specific first exon spliced to exon 30 (Dp260), exon 45 (Dp140),
exon 56 (Dp116), or exon 63 (Dp71). Dp260 is mainly expressed in
the retina (D’Souza et al., 1995); Dp140 is expressed in the fetal brain and,
to a lesser extent, in the adult brain, the retina, and the kidneys
(Bardoni et al., 2000; Lidov et al., 1995); Dp116 is expressed in adult peripheral
nerves (Byers et al., 1993; Labarque et al., 2008); and Dp71, the major
dystrophin isoform in the central nervous system, is also expressed
in various non-muscle tissues (Austin et al., 1995, 2000; Bar et al., 1990;
Blake and Kroger, 2000; Ceccarini et al., 2002; Chamberlain et al., 1988;
Daoud et al., 2009; Greenberg et al., 1996; Holder et al., 1996; Huard and
Tremblay, 1992; Ilarraza-Lomeli et al., 2007; Lederfein et al., 1992; Miyatake
et al., 1991; Rapaport et al., 1992). Although Dp71 has a nearly ubiquitous
expression profile, it is not expressed in skeletal muscle cells. Depending on
the promoter used to generate the dystrophin pre-mRNA, the functional
domains included in the protein differ (Fig. 2.3).

Nonsense mutations have been found in 8.7% of French patients with
DMD, distributed as follows: 4.3% of TGA (i.e., 49.5% of the nonsense
mutations), 2.7% of TAG (i.e., 31% of the nonsense mutations), and 1.7% of
TAA (i.e., 19.5% of the nonsense mutations) (database-DMD). This
distribution closely matches the general distribution of nonsense mutations in
human pathologies (Atkinson and Martin, 1994). In a Japanese DMD database,
nonsense mutations represent 19% of all mutations, showing some variability
according to the studied population (Takeshima et al., 2010). Correction
of nonsense mutations represents a therapeutic strategy of interest for
patients with DMD, since the consequences of the mutation are thought to be
reversible. Indeed, it is believed that the damage induced by the absence of
wild-type dystrophin can be reversed if a functional dystrophin protein is
reintroduced in cells. A cure by re-expression of the missing protein is
expected for most genetic diseases, except those affecting the expression
of genes required during development, for instance. In that last situation, a
treatment would be necessary during embryogenesis and fetal development.
Such a treatment would have to cross the placental barrier, and would
require detection of the genetic pathology as early as possible before birth.
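
The stop-codon shares quoted above for the French DMD cohort can be rechecked with a few lines of arithmetic. The following is only a sketch: the three patient-level percentages are copied from the text, while the variable names are illustrative choices and not part of any database interface. Because the inputs are already rounded, the recomputed shares may differ from the quoted ones by about a tenth of a percentage point.

```python
# Patient-level frequencies (percent of all French DMD patients), as quoted in the text.
patient_freq = {"TGA": 4.3, "TAG": 2.7, "TAA": 1.7}

# Overall frequency of nonsense mutations; the text reports 8.7%.
total = sum(patient_freq.values())

# Share of each stop codon among the nonsense mutations only.
shares = {codon: 100 * freq / total for codon, freq in patient_freq.items()}

for codon, share in shares.items():
    print(f"{codon}: {share:.1f}% of nonsense mutations")
```

The same normalization applies to any cohort for which patient-level frequencies per stop codon are available.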

Another attractive point for the development of a strategy to correct
nonsense mutations in order to cure DMD comes from the fact that the
C-terminal part of the protein, encoded by exons 70 to 79, is
not essential for the function of the protein, at least in muscle cells
(Crawford et al., 2000). This means that inhibition of nonsense-mediated mRNA
decay (NMD) alone, allowing the synthesis of a truncated dystrophin
protein when a premature termination codon (PTC) is present in these exons,
would be a sufficient therapeutic strategy.

1.2 Cystic Fibrosis (CF) 

CF is a recessive pathology related to a dysfunction of the cystic fibrosis
transmembrane conductance regulator (CFTR) gene (Kerem et al., 1989;
Riordan et al., 1989; Rommens et al., 1989). The CFTR gene spans almost 190
kilobases, is located on the long arm of chromosome 7, and comprises 27
exons. The CFTR mRNA is 6129 nucleotides long, and encodes a protein
of 1480 amino acids with a molecular weight around 170 kDa.

The CFTR protein is a member of the ATP binding cassette (ABC)
superfamily (ABCC7), a family sharing some structural properties, such as
the presence of a transmembrane domain (TMD) and a nucleotide-binding
domain (NBD), where ATP is recruited and hydrolyzed. More specifically,
CFTR is composed of 5 domains: 2 TMDs (TMD1 and TMD2),
each of which includes 6 transmembrane α-helices; 2 NBDs (NBD1 and
NBD2); and one regulator domain (R) harboring several phosphorylation
sites that are substrates for the cAMP-dependent protein kinase A and
protein kinase C (Fig. 2.4).
Both NBD domains bind ATP, thanks to ATP consensus binding sites called
Walker A and Walker B (Walker et al., 1982). CFTR is a regulated channel
that transports chloride from inside to outside the cell. The absence of chloride
transport leads to an increase in the viscosity of the mucus, in particular in
the lungs. Lung function is then impaired, and bacterial proliferation
is facilitated in the environment found in the lungs of CF patients.

The CFTR protein is post-translationally modified and, in particular,
it is glycosylated before being targeted to the apical plasma membrane of
the cell. For that, CFTR is subject to different steps of maturation in
different compartments. First, the CFTR protein is synthesized and immediately
delivered to the endoplasmic reticulum (ER). In the ER, CFTR is folded
and core-glycosylated to give an immature isoform of CFTR,
identified by western blot as band B, with a molecular weight around 145
kDa. Then, CFTR is targeted to the Golgi apparatus, where it is subject to
an additional glycosylation that generates the CFTR species known
as western blot band C, of about 170 kDa. This mature form of
CFTR then localizes to the apical plasma membrane (Amaral, 2005; Pranke
and Sermet-Gaudelus, 2014) (Fig. 2.5).

Figure 2.4 Schematic representation of the CFTR protein. CFTR is a transmembrane
channel transporting chloride from inside to outside of the cell (red arrow). It is
composed of two TMDs (TMD1 and TMD2), two NBDs (NBD1 and NBD2), responsible for the
hydrolysis of ATP, and a regulator (R) domain linking the two similar halves of
the protein. The R domain is also a substrate for the protein kinases A and C.

CF is the most frequent rare disease in the Caucasian population, with
one case in about 3000 newborns. It is an autosomal recessive disorder,
with an average life expectancy of around 37 years (O’Sullivan and
Freedman, 2009). CFTR is expressed in most epithelial tissues, explaining
why CF affects various organs, such as the lungs, pancreas, intestine, male
reproductive organs, or bones, for instance (De Boeck et al., 2006; Fanen
et al., 2014; Farrell et al., 2008; Sermet-Gaudelus et al., 2007). Although
many organs are impaired in CF, lung dysfunction is the main cause of
mortality for CF patients (Fanen et al., 2014; O’Sullivan and Freedman, 2009).
To date, 2000 different mutations have been found in CFTR, including
the most frequent one, found in 70% of patients: a deletion
of phenylalanine 508 (F508del) that affects the folding of the
CFTR protein. Nonsense mutations represent 8.35% of all mutations in the
CF mutation database updated in April 2011 (database-CF). In the
UMD-CFTR French database, nonsense mutations reach 14% for CF patients,
and 5.3% for patients with congenital bilateral absence of the vas deferens,
a syndrome present in patients with a mild mutation in CFTR that allows
an expression of about 10% of CFTR, compared to normal. Interestingly,
overall in this database, 68% of nonsense mutations are TGA, 24% are TAA,
and 8% are TAG (source: UMD-CFTR http://www.umd.be/CFTR/).
This distribution of the different nonsense mutations is unusual, since the
least frequent nonsense mutation is TAG, and not TAA, unlike what is found
in other human pathologies (Atkinson and Martin, 1994). In addition, the TGA
nonsense mutation shows a large predominance, compared to the
proportion found in other human diseases, for which the TGA nonsense mutation
represents around 51% of the nonsense mutations.

Figure 2.5 Expression pathway of the CFTR gene, from transcription to the cellular
membrane localization. The CFTR gene is transcribed into CFTR pre-mRNA and, after
maturation, into CFTR mRNA before being exported to the cytoplasm, where it is translated
into a CFTR protein. The CFTR protein is then glycosylated in the endoplasmic reticulum
(ER) before being exported to the Golgi apparatus, where the CFTR protein is subject to
another round of glycosylation. CFTR is then fully matured, and can be targeted to the
cell membrane. On the right side are shown the six main categories of
mutations affecting CFTR (purple box), an example of a mutation type (green box), and the
consequence on CFTR (blue box). ORCC: outwardly rectifying chloride channel.

Mutations in CFTR have been organized into six different categories
(Welsh and Smith, 1993). Class I is composed of all mutations leading to the
absence of synthesis of the CFTR protein. For example, nonsense mutations,
frameshift mutations, and splicing defects are found in this category, since
the consequence is the introduction of a PTC that promotes the fast decay of
the CFTR mRNA by NMD (Hamosh et al., 1991; Tsui, 1992). Class II groups
mutations impairing the processing of the CFTR protein. It is in this class that
the F508del mutation is found, since it interferes with the folding of CFTR,
which accumulates in the ER, where it is degraded (Cheng et al., 1990; Thomas
et al., 1992). Class III contains all mutations affecting the regulation of the
CFTR protein. Mutations located in the NBD, for instance, belong to that
category. The missense mutation G551D is found in this category, since
it strongly reduces the function of CFTR, even though the protein localizes to
the apical cell membrane. Class IV is for mutations affecting the conduction of
the CFTR channel. These mutations reduce the ion flow through the channel,
or decrease the opening time of CFTR, limiting the exit of chloride ions,
and leading to CF. Originally, only four classes were described; two
additional ones were then proposed, in order to complete the classification of
all mutations affecting CFTR (Fanen et al., 2014) (Fig. 2.5). Class V concerns
mutations decreasing the production of CFTR mRNA, such as mutations
in the CFTR promoter inhibiting the transcription of CFTR, or mutations
responsible for the activation of an alternative splicing event leading to the
synthesis of a nonfunctional or partially functional truncated CFTR protein, for
instance (Hinzpeter et al., 2010; Hull et al., 1993). Finally, class VI contains
all mutations increasing the turnover of the functional CFTR protein, or
inhibiting the regulation of other ion channels by CFTR, such as the outwardly
rectifying chloride channel (ORCC), calcium-activated chloride channel, renal
outer medullary potassium channel, or the epithelial Na+ channel (Boucher
et al., 1986; Gabriel et al., 1993; Haardt et al., 1999; Jovov et al., 1995;
Kunzelmann et al., 1997; Mali et al., 2004; Stutts et al., 1997; Toczylowska-Maminska
and Dolowy, 2012; Wei et al., 2001; Yoo et al., 2004). Mutations from classes I,
II, III, and VI promote an absence or a very low level of CFTR expression, and
are associated with a severe clinical phenotype, unlike mutations from classes
IV and V, with which a partial CFTR function remains and confers a mild
clinical phenotype. It is worth noting that only 5% of the wild-type level of
CFTR mRNA is enough to improve the pulmonary clinical phenotype
(Ramalho et al., 2002). However, with 10% of CFTR mRNA, compared to healthy
persons, people may still develop some syndromes, such as congenital bilateral
absence of the vas deferens in males, leading to azoospermia and sterility (source:
UMD-CFTR http://www.umd.be/CFTR/).
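
The six-class scheme described above lends itself to a small lookup table. The sketch below is illustrative only: the class numbers, defects, and severe/mild phenotypes are summarized from the text, whereas the names `CftrMutationClass`, `CFTR_CLASSES`, and `classes_with_phenotype` are hypothetical and do not correspond to any existing database or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CftrMutationClass:
    defect: str     # short summary of the defect, paraphrased from the text
    phenotype: str  # "severe" or "mild", as stated in the text


# Hypothetical lookup table summarizing the six classes described above.
CFTR_CLASSES = {
    "I": CftrMutationClass("no CFTR protein synthesized (PTC, NMD)", "severe"),
    "II": CftrMutationClass("impaired processing or folding of CFTR", "severe"),
    "III": CftrMutationClass("impaired regulation of the channel", "severe"),
    "IV": CftrMutationClass("reduced chloride conduction", "mild"),
    "V": CftrMutationClass("reduced production of CFTR mRNA", "mild"),
    "VI": CftrMutationClass(
        "increased protein turnover or lost regulation of other channels", "severe"
    ),
}


def classes_with_phenotype(phenotype: str) -> list[str]:
    """Return the class names associated with the given clinical phenotype."""
    return [name for name, c in CFTR_CLASSES.items() if c.phenotype == phenotype]


print(classes_with_phenotype("mild"))
```

Nonsense mutations fall in class I, which is why their correction targets the most severe end of this table.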

Although all the phenotypical characteristics might not be completely
reversed, CF is a particularly well-suited pathology for the development of
therapeutic strategies correcting nonsense mutations. The efficiency
of such strategies has remained low until now (see chapter: Strategies to Correct
Nonsense Mutations; Section 3), but the 5% of CFTR expression sufficient
to reverse the main impairment in lung function seems to be within reach, since
that level has already been reached in different cell models with various molecules,
providing great hope for patients (Du et al., 2009; Manuvakhova et al., 2000).

1.3 Spinal Muscular Atrophy 

Spinal muscular atrophy (SMA) is an autosomal recessive disorder caused,
in its most common form, by mutations in the SMN1 gene encoding the survival
motor neuron (SMN) protein, but 33 genes have been identified as
potential causes of SMA, of which 13 since 2011. SMA affects one
newborn in 10,000 and represents the leading inherited cause of infant mortality.
The proportion of asymptomatic adult carriers is approximately 1/50
(Sugarman et al., 2012). The consequences of SMA are muscle weakness and
atrophy, and motor neuron death in the spinal cord and lower brainstem.

In humans, a paralog of the SMN1 gene exists in the close vicinity of SMN1,
and was named SMN2. SMN2 is present in variable copy numbers, from
0 up to 8 copies, that are almost identical to SMN1, with the
exception of a silent point mutation changing a cytosine into a thymidine
at the beginning of exon 7 of the SMN2 gene. The consequence of
this mutation is to induce exon 7 skipping, leading to the synthesis
of an unstable truncated SMN protein that is the major SMN isoform in
patients with SMA, in whom SMN1 expression is abolished by
mutations in this gene (Monani et al., 1999). Indeed, this single nucleotide
change abolishes an exonic splicing enhancer, and creates an exonic
splicing silencer bound by the heterogeneous nuclear ribonucleoprotein
A1 (Cartegni and Krainer, 2002; Kashima et al., 2007) (Fig. 2.6). Since a
small fraction of SMN2 transcripts includes exon 7 in the mRNA,
the phenotype severity of SMA patients is related to the copy number of
the SMN2 gene: the more copies of the SMN2 gene are present, the less
affected is the patient (Feldkotter et al., 2002; Lefebvre et al., 1997; Prior
et al., 2004).

Figure 2.6 Expression profile of the SMN1 and SMN2 genes. The SMN1 gene harbors an
exonic splicing enhancer in exon 7, recognized by SR proteins, and leading to the
incorporation of exon 7 in the mRNA, in order to synthesize a functional full-length
SMN protein. The SMN2 gene is mutated in exon 7, with a transition of a cytosine
into a thymidine. That mutation changes the exonic splicing enhancer into an exonic
splicing silencer, leading to exon 7 skipping during splicing. The translation of such
an mRNA generates a nonfunctional internally truncated SMN protein.

The SMN protein localizes in the nucleus, as well as in the cytoplasm.
SMN stably interacts with gemin proteins 2–6 to form the SMN complex
(Paushkin et al., 2002) involved in various mechanisms in the cell, including
a role in the synthesis, maintenance, and recycling of Sm proteins (Fischer
et al., 1997; Meister et al., 2001; Pellizzoni et al., 2002), and in the formation
of the U snRNPs (Chari et al., 2008). Indeed, seven Sm proteins (B/B’, D1,
D2, D3, E, F, and G) are found in the U snRNPs (U1, U2, U4, U5, U11,
U12, and U4atac), which are major components of the spliceosome (Lerner
and Steitz, 1979; Will and Luhrmann, 2001). Interestingly, the SMN protein,
via its function in Sm protein regulation, affects its own expression, since
a decrease in the level of SMN protein favors the skipping of exon 7 of
its own pre-mRNA (Jodelka et al., 2010; Ruggiu et al., 2012).

Therapeutic strategies developed to treat SMA have concentrated on
influencing the splicing of exon 7 of SMN2, in order to restore the
expression of the SMN protein. Some of these strategies will be described in
chapter: Strategies to Correct Nonsense Mutations.

2 FREQUENT DISEASES 

A pathology is considered a frequent disease when its incidence is higher
than 1 in 3000 newborns. These frequent disorders are often multifactorial,
and the factors can be environmental, infectious, or genetic, for example.
Such frequent diseases might also require mutations in several genes to
develop. To illustrate the involvement of genetics in frequent diseases, three
families of frequent pathologies will be described (cancer, metabolic, and
neurologic diseases) via their genetic origin.

2.1 Cancers 

Cancers are defined by an anarchic cellular proliferation that forms a tumor
within a tissue of an organism. Cancer cells have the capacity to divide
endlessly and to become immortal. Tumor cells acquire the capacity to migrate
and to colonize other tissues to form metastases. Multiple factors can be at
the origin of a cancer, such as exposure to radiation (UV or radioactive,
for instance), to stress, to an abnormal hormonal level, to a pathogen,
or to a cancer-inducing chemical. Mutations in a specific gene can also induce
cancer development. For example, mutations in tumor suppressor genes can
cause cancer, since those genes are often involved in cell cycle regulation,
and in particular in the arrest of the cell cycle and, eventually, in the
activation of apoptosis. One of the most studied tumor suppressor genes encodes
the tumor protein p53 (TP53), which has been found mutated in more than
50% of all human cancers (Carson and Lois, 1995). p53 is a multifunctional
protein, since it is at least a transcription factor, a cell cycle regulator, and
an apoptosis activator (Vousden and Prives, 2009). p53 plays a surveillance
role on the integrity of the genome. If the genomic DNA is damaged after
radiation or chemical exposure, for example, p53 will be involved in the
decision to fix the damage by stimulating the expression of DNA repair
factors, or to induce cell death by blocking the cell cycle and activating the
expression of apoptotic genes.

Another well-studied tumor suppressor gene is the breast cancer
susceptibility gene 1 (BRCA1). The BRCA1 protein also plays a role in the
response to DNA damage, and in particular DNA double strand breaks,
by either stimulating DNA repair or by favoring cell death (Yoshida
and Miki, 2004). In case of DNA damage, the BRCA1 protein becomes
hyperphosphorylated, and localizes at the DNA replication forks (Scully
et al., 1997; Thomas et al., 1997). BRCA1 together with BRCA2 stimulates
the double strand DNA repair machinery involving homologous
recombination. By interacting with RNA polymerase II, BRCA1 is also
involved in transcription, and stimulates the expression of genes such as
p21 or GADD45, proteins involved in cell cycle progression, indicating an
indirect role of BRCA1 in cell cycle control (Li et al., 2000), consistent
with its requirement for the S-phase and the G2/M-phase checkpoints
(Cortez et al., 1999).

The phosphatase and tensin homolog (PTEN) is a tumor suppressor
acting differently from p53 or BRCA1. PTEN dephosphorylates
phosphatidylinositol 3,4,5-trisphosphate (PIP3) into phosphatidylinositol
4,5-bisphosphate (PIP2), inhibiting the protein kinase B (Akt) pathway. Indeed,
Akt needs PIP3 to be addressed to the cell membrane, and to be activated
by phosphorylation. Once Akt is phosphorylated, it can regulate cell
survival, and inhibit cell death (Song et al., 2005), explaining how PTEN can
be a tumor suppressor by inhibiting the activation of Akt.

The last tumor suppressor gene that will be mentioned is the gene encoding
the adenomatous polyposis coli (APC) protein. Mutations in the APC gene increase
the risk of developing colon cancer. APC interacts with several key proteins,
such as beta-catenin, or the tubulin forming the microtubules, to repress cell
division, or to interfere with cell shape and cell mobility (van Es et al., 2001). It
is via those interactions that APC interferes with cancer progression.

The proportion of nonsense mutations affecting tumor suppressor genes
is variable, according to the gene and the tissue. All the following
distributions can be found in the COSMIC library at the following website address:
http://cancer.sanger.ac.uk/cosmic/. For instance, nonsense mutations have
been found in 7.7% of the 2129 TP53-mutated samples found in the
COSMIC library, in 11.4% of the 413 BRCA1-mutated samples, in 15.8% of
the 3250 PTEN-mutated samples, and in 41.5% of the 4216 APC-mutated
samples. For this latter tumor suppressor gene, the proportion of nonsense
mutations falls to 12.5% in the central nervous system, showing the high
variability of the distribution of a specific type of mutation, according to the
tissue. In any case, the correction of nonsense mutations in cancer would
benefit a significant proportion of cancer patients.
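
To give a sense of scale, the COSMIC percentages quoted above can be converted into approximate sample counts. This is a back-of-the-envelope sketch: the sample totals and percentages are those given in the text, the resulting counts are estimates derived from rounded percentages rather than values read from COSMIC, and the variable names are illustrative.

```python
# (number of mutated samples, percent carrying a nonsense mutation), as quoted in the text.
cosmic = {
    "TP53": (2129, 7.7),
    "BRCA1": (413, 11.4),
    "PTEN": (3250, 15.8),
    "APC": (4216, 41.5),
}

# Estimated number of samples with a nonsense mutation for each gene.
estimated = {gene: n * pct / 100 for gene, (n, pct) in cosmic.items()}

for gene, count in estimated.items():
    print(f"{gene}: roughly {count:.0f} samples with a nonsense mutation")
```

Even for TP53, where nonsense mutations are proportionally the least frequent of the four genes, this corresponds to more than a hundred samples in the library.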

Several lines of evidence demonstrate the attractiveness of nonsense
mutation correction as a therapeutic approach to cancer. In the case of nonsense
mutations affecting a tumor suppressor gene, the correction of the nonsense
mutation will rescue the expression of the tumor suppressor gene. In that
situation, correction of the nonsense mutation will come as a complementary
approach to a therapeutic treatment. Indeed, rescuing the expression
of the mutant tumor suppressor gene will restore it
to its physiologic level, which is not the level promoting apoptosis, for
example. In order to promote cell death, in addition to rescuing the mutant
tumor suppressor gene expression, an apoptosis-inducing treatment will likely
be necessary. This apoptosis treatment will raise the expression level of the
tumor suppressor gene, in order to reach the threshold triggering apoptosis.
The correction of nonsense mutations is necessary to bring back the normal
expression of the mutant tumor suppressor, which can then reach the
apoptosis-triggering level under apoptotic stimuli.

Correction of nonsense mutations, or more precisely inhibition of NMD, can
be of further interest in the treatment of cancer, even for cancers with
an origin that is not related to a nonsense mutation. A strong inhibition of
NMD leads to the expression of natural substrates of NMD. Among them,
some encode proteins involved in apoptosis, including GADD45a,
GADD45b, or CDKN1A (Mendell et al., 2004; Popp and Maquat, 2015; Viegas
et al., 2007), which will favor the entry into apoptosis, since these genes are
repressed under physiological conditions, and would now become expressed
after NMD inhibition. In addition, by dividing continuously, cancer cells
increase the probability of acquiring de novo mutations, including nonsense
mutations (Campbell et al., 2010; Pleasance et al., 2010). Suddenly expressing a
wide panel of truncated proteins might interfere with various physiological
processes, and therefore induce apoptosis. It is thus expected to find more
mutations in cancer cells than in healthy cells, even inside the same organism,
increasing the probability of having more nonsense mutations in cancer cells
than in healthy cells. Based on that, inhibition of NMD might be more
deleterious in cancer cells than in healthy cells inside the body of a cancer patient.

The third interest of NMD inhibition for the treatment
of cancer is that this approach might be a way to develop an
immunotherapy targeting cancer cells. Indeed, inhibition of NMD could lead
to the synthesis of many proteins with an abnormal C-terminal end
(Fig. 2.7). For example, the translation of mRNA isoforms retaining a
PTC-containing intron will generate a truncated protein, with a C-terminal part
corresponding to the intronic sequence upstream of the PTC. All these new
C-terminal fragments do not exist in healthy cells, and can be part of the
peptides presented at the cell surface to the immune system. These abnormal
C-terminal parts of proteins will then be recognized as nonself, and will
activate an immune response. Since cancer cells are more prone to
accumulating mutations, under NMD inhibition they will present some
cancer-specific antigens on their cell surface that can promote a specific
antitumor immune response (Pastor et al., 2010).

Figure 2.7 Possible specific development of an antitumor immune response by
inhibition of NMD. When NMD occurs (left panel), PTC-containing mRNAs are degraded by
NMD, excluding the presentation at the cell surface of tumor-specific peptides. In contrast,
when NMD is inhibited, PTC-containing mRNAs are not quickly degraded, and can
be translated into a protein. All the parts of these proteins can be presented at the cell
surface, including some C-terminal ends translated from mRNAs harboring a frameshift
mutation or some intronic sequences. Such peptides will not be recognized as
self-antigens, and will be responsible for the activation of the immune system.

2.2 Metabolic Diseases 

Metabolic diseases comprise all the pathologies that interfere with the
conversion of food into energy. Metabolic diseases can be inherited, in which
case they are known as inborn errors of metabolism, or they can be acquired
during the lifetime. The
frequency of inherited metabolic diseases is less than 1 in 3000 newborns,
making them rare diseases. For example, Gaucher’s disease affects 1/60,000
of the worldwide population; it results from the lack of an enzyme called
glucocerebrosidase, involved in the metabolism of a fatty substance called
cerebroside. The frequency of that pathology is highly variable, according to
ethnicity. For example, it can reach 1/450 people among the Ashkenazi
population, making Gaucher’s disease a very frequent pathology in this
population.

Other well-studied inherited metabolic diseases include Fabry’s
disorder, with an estimated occurrence of 1/40,000 births, in which a deficiency
of the alpha-galactosidase A enzyme leads to the accumulation of lipids in various
organs, such as the kidney or heart, for instance. Type 1
mucopolysaccharidosis, or Hurler syndrome, is another example of an inherited metabolic
pathology, which affects 3-4000 people worldwide. It is a deficiency in the L-iduronidase
enzyme that leads to the pathology, with damage in the heart, lungs, kidney,
and central nervous system. L-Iduronidase is responsible for the metabolism
of heparan sulfate, and of other glycosaminoglycans such as dermatan sulfate.
This enzyme localizes in the lysosomes, which are organelles in charge of
degrading unwanted molecules in the cell (Appelqvist et al., 2013). In case
of L-iduronidase impairment, glycosaminoglycans accumulate in lysosomes
and in the extracellular matrix. The accumulation of dermatan sulfate
interferes with elastic fiber assembly, partially explaining the phenotype found in
Hurler syndrome (Hinek and Wilson, 2000). The main therapeutic strategy
for the treatment of metabolic diseases is enzyme replacement therapy,
which consists of delivering the missing enzyme to the cell.

Nonsense mutations are found among mutations responsible for
metabolic diseases. Nonsense mutation correction seems well suited to the
treatment of metabolic diseases caused by nonsense mutations, since the
level of the enzyme to rescue often does not need to reach 100%. For
example, in a propionic acidemia mouse model, a rescue of less than 20%
of the wild-type level of propionyl-coenzyme A carboxylase, the deficient
enzyme in that pathology, is enough to prevent neonatal lethality
(Miyazaki et al., 2001). It has also been shown that, in Hurler syndrome, a low
amount of L-iduronidase, as low as 3%, is sufficient to demonstrate
a beneficial effect (Bunge et al., 1998; Keeling et al., 2001). Readthrough
molecules have been tested on fibroblasts of patients with Hurler syndrome,
and showed their efficiency in inducing expression of L-iduronidase, and in
decreasing the concentration of glycosaminoglycans in lysosomes (Keeling
et al., 2001; Wang et al., 2012). Positive effects of aminoglycosides and
non-aminoglycosides, such as ataluren/PTC124, on various inherited metabolic
diseases caused by a nonsense mutation have been described
for peroxisome biogenesis disorders, carnitine palmitoyl transferase
deficiency, or branched-chain organic acidurias, including propionic acidemia
(Buck et al., 2009; Dranchak et al., 2011; Hein et al., 2004;
Helip-Wooley et al., 2002; Keeling et al., 2001; Perez et al., 2012; Sanchez-Alcudia
et al., 2012; Sarkar et al., 2011; Tan et al., 2011; Wang et al., 2012).

2.3 Neurologic Disorders 

Neurologic diseases group all pathologies affecting the central and/or the
peripheral nervous system, including the brain, spinal cord, autonomic nervous
system, neuromuscular junction, and also muscles. According to the World
Health Organization, several hundred million people worldwide are
affected by a neurologic disorder. More than 600 neurologic pathologies have
been reported, some of them very common, such as Alzheimer’s disease,
affecting more than 30 million people worldwide (Querfurth and LaFerla, 2010),
Parkinson’s disease, epilepsy, affecting more than 50 million people
worldwide, or migraine, which would affect 10% of the population worldwide.

Nonsense mutations have been found in genes responsible for neurologic
disorders. For example, Rett syndrome is a pathology in which nonsense
mutations in the methyl CpG binding protein 2 (MECP2) gene, the gene
at the origin of the pathology when mutated (Lewis et al., 1992), are found
in 27% of Rett syndrome patients (source: RettBASE,
http://mecp2.chw.edu.au/cgi-bin/mecp2/views/basic.cgi?form=amino-freq). Unlike other
neurologic disorders, the Rett syndrome clinical phenotype does not
progress after early childhood (Naidu et al., 2003). The clinical signs
are a developmental arrest, a loss of language acquisition, and a transient
autistic-like behavior. The MECP2 gene is carried by the X-chromosome,
and encodes a transcription repressor that could also be a transcription
activator (Chahrour et al., 2008; Cohen et al., 2008). The protein MeCP2
binds to methylated DNA to repress or eventually activate transcription
of specific genes, in order to promote the maturation of the central
nervous system, and to form synaptic contacts (Akbarian et al., 2001; Jung
et al., 2003; Luikenhuis et al., 2004; Shahbazian et al., 2002). Rett syndrome
affects girls with a prevalence of 1 in 10,000, and rarely boys, likely because it
is lethal in males (Samaco et al., 2004; Shahbazian and Zoghbi, 2002). Often,
mutations occurring in the MECP2 gene arise de novo. Mutations in the
MECP2 gene impair the function of the protein in a dominant negative
way, explaining why they would be lethal in affected boys but not in girls,
who survive thanks to X-chromosome inactivation, which can silence either
the wild-type or the mutated X-chromosome, leading to a mosaic person
(Bienvenu et al., 2002; De Bona et al., 2000; Hoffbuhr et al., 2001, 2002;
Zappella et al., 2003). A case study on monozygotic twins carrying
mutations in the MECP2 gene illustrates how the process of X-chromosome
inactivation can dictate the health status of a Rett syndrome patient. Of
these twins, one girl is asymptomatic, while her sister has a severe Rett
syndrome phenotype. A peripheral blood cell analysis demonstrated that the
X-chromosome inactivation ratio is 99:1 in favor of the wild-type
X-chromosome for the asymptomatic twin, and 40:60 in favor of the
mutant X-chromosome for the sister with the severe Rett syndrome
phenotype (Hoffbuhr et al., 2001). These data suggest that the severity of the
pathology is related to the number of cells expressing the wild-type MeCP2
protein. The dominant effect is only related to the fact that only one allele
of the X-chromosome is expressed in females, and does not correspond to
a gain of function of the mutant protein. That also explains why it would
be lethal for boys, since there is only one copy of the X-chromosome, and
if it carries a mutation in the MECP2 gene, none of the cells will have a
functional MeCP2 protein. Consequently, therapeutic approaches to
rescue the clinical phenotype of Rett syndrome patients will be challenging,
and the success will depend on the ratio of cells that will respond to the
treatment. Indeed, in the example described earlier, 40% of cells expressing
the wild-type isoform of the MeCP2 protein are not enough to be
asymptomatic. It means that patients expressing wild-type MeCP2 with a ratio
of 1:99 in favor of the mutant MeCP2 will be more difficult to treat
successfully than patients already expressing 40% or 50% of the wild-type
protein, suggesting a big difference in the response to putative treatments.
In the case of nonsense mutation therapy by the readthrough approach, very
efficient compounds are needed since, to date, readthrough efficiency can
reach only a few percent of the level of the wild-type protein (see chapter:
Strategies to Correct Nonsense Mutations; Section 3).

Another neurologic disorder, epilepsy, has often been related to nonsense
mutations. Mutations in various genes lead to the development of
epilepsy, such as the γ-aminobutyric acid receptor type A (GABA), leucine-rich
glioma-inactivated 1 (LGI1), or NaV1.1 neuronal sodium channel
alpha-subunit (SCN1A) genes, illustrating the complexity of epilepsy
syndromes. Indeed, several categories of epilepsy have been described,
such as the relatively benign generalized epilepsy with febrile seizures plus
(GEFS+), the severe myoclonic epilepsy in infancy (SMEI), also known as
Dravet syndrome, the borderline SMEI (SMEB), the intractable childhood
epilepsy with generalized tonic-clonic seizures (ICEGTCS), some rare
cases of familial migraines, the autosomal dominant lateral temporal epilepsy
(ADLTE), or the autosomal dominant partial epilepsy with auditory features
(ADPEAF). Mutations in one gene can lead to the development of different
types of epilepsy, as demonstrated for the SCN1A gene mutations that can
cause GEFS+, SMEI, SMEB, or ICEGTCS (Ceulemans et al., 2004; Claes
et al., 2001; Escayg et al., 2000; Mulley et al., 2005; Wallace et al., 2001b)
or, in rare cases, familial migraine (Dichgans et al., 2005). LGI1 gene
mutations are responsible for ADLTE or ADPEAF (Kalachikov
et al., 2002; Morante-Redolat et al., 2002), and mutations in the GABA
gene lead to genetic (idiopathic) generalized epilepsy (GGE or IGE),
infantile spasms (IS), autosomal dominant juvenile myoclonic epilepsy (ADJME),
childhood absence epilepsy (CAE), Lennox-Gastaut syndrome (LGS),
Dravet syndrome, febrile seizures, or GEFS+ (Allen et al., 2013; Audenaert
et al., 2006; Baulac et al., 2001; Carvill et al., 2013; Cossette et al., 2002;
Delahanty et al., 2011; Dibbens et al., 2004, 2009; Harkin et al., 2002;
Hirose, 2014; Ishii et al., 2014; Johnston et al., 2014; Kananura et al., 2002;
Lachance-Touchette et al., 2010, 2011; Maljevic et al., 2006; Shi et al., 2010;
Sun et al., 2008; Tanaka et al., 2008; Tian et al., 2013; Urak et al., 2006;
Wallace et al., 2001a). Interestingly, different mutation types predominate
according to the category of epilepsy. For example, GEFS+ is often caused
by missense mutations (leading to the replacement of one amino acid by
another one), while SMEI is mainly due to nonsense or frameshift
mutations in the SCN1A gene (Ceulemans et al., 2004). As
for the MECP2 gene, mutations in SCN1A mainly arise de novo, unlike in
the LGI1 gene, in which de novo mutations represent only 2% of mutations
(Bisulli et al., 2004; Michelucci et al., 2007).




All the previous data indicate how complex it is to treat one pathology
with a unique treatment, rather than to treat the molecular problem at the
origin of the pathology. With the development of targeted medicine, we
might witness a reversal of that way of thinking: a treatment will apply
to patients having the same molecular dysfunction, and affected by different
diseases. The therapeutic unit will not be the pathology any longer, but the
molecular event at the origin of the disorder. It is one of these new targeted
therapies that we are going to explore in the next chapters that concern the
correction of nonsense mutations.

Nonsense mutations, which represent about 10% of the mutations responsible
for genetic pathologies (Mort et al., 2008), have attracted the attention
of researchers and industry seeking to correct them specifically. Various
strategies have been developed in order to correct nonsense mutations. Since
nonsense mutations are not purely patient specific and are found in various
pathologies, strategies to correct nonsense mutations represent more an
example of targeted therapies than a case of personalized medicine. Some
of these strategies are nonsense mutation-specific, and some can apply to
different categories of mutations. Finally, some will affect the genomic
DNA and some will target the RNA only, excluding the ethical question
of heredity. Overall, all demonstrate some interesting properties to correct
nonsense mutations, but all have weaknesses. A nonexhaustive list of the
strategies that can potentially be used to correct nonsense mutations, as well
as some combinations of strategies, is presented in this text.


1 THE EXON SKIPPING 
1.1 Principle 

Since the identification of introns (also known as intragenic regions) in
the second half of the 1970s (Berget et al., 1977; Chow et al., 1977), their
removal by the splicing reaction has been extensively studied. Cis- and
trans-regulators have been identified and characterized in numerous models,
leading to a better understanding of this mechanism (Wang and Burge, 2008;
Witten and Ule, 2011). It therefore became possible to interfere with the
splicing reaction in order to activate or inhibit it. Several strategies have
been developed to inhibit splicing in order to remove the mutation from
the mRNA by inducing an exon skipping. Such a strategy can therefore apply
to the correction of a nonsense mutation by removing the PTC-containing
exon from the mRNA. Interestingly, the exon skipping strategy also applies
to any mutation (insertion, deletion, nonsense, or missense) limited to one
or several exon(s), since the skipping can target several exons in order to
fix a mutation that affects several exons or to restore the original
open reading frame (ORF). Indeed, exon skipping opens the possibility
to restore an ORF after a deletion of one or several exons, for instance. It
becomes a molecular tool to perform surgery on the mRNA in order to fix the
consequence of a mutation, without affecting the host genome (Fig. 3.1).

Historically, scientists attempted to prevent the splice site recognition
by the spliceosome, using antisense oligonucleotides. The nature of these
oligonucleotides evolved from RNA molecules, in the beginning, to
peptide nucleic acid (PNA) or other modified nucleic acids, in order to
improve the strength of the annealing between the oligonucleotide and
the RNA target, and to protect such oligonucleotide molecules from
degradation by nucleases (Table 3.1) (Aartsma-Rus et al., 2004; Davis
et al., 2009; Egholm et al., 1993; Goemans et al., 2011; Karras et al., 2000;
Lu et al., 2005; Murray et al., 2012; Summerton and Weller, 1997; Takeshima
et al., 2006; Wheeler et al., 2012; Yagi et al., 2004; Yamada et al., 2011).

Figure 3.1 Principle of exon skipping. (A) An example of constitutive splicing on a
pre-mRNA. In that situation, all introns are removed and all exons subsist in the mRNA.
(B) An example of an exon skipping occurring on a pre-mRNA. The consequence of the
exon 2 skipping of this pre-mRNA is the ligation of the exon 1 with the exon 3, and the
absence of the exon 2 from the mRNA. Horizontal lines represent introns and brown
boxes symbolize exons. Also, 5' and 3' splice sites (ss) are indicated.

The idea is to introduce into the cell a small nucleic acid molecule
carrying the antisense sequence of a 5' or 3' splice site, in order to prevent
the recognition of the target sequence by a trans factor. The masked splice
site can then no longer be bound by splicing factors, and the spliceosome
will ignore it and use another splice site (Fig. 3.2A).

Another strategy to induce an exon skipping is to tether an inhibitor of
splicing close to the targeted splice site, in order to prevent its recognition
by the spliceosome. This approach is more complex than the use of an
antisense oligonucleotide because it requires assembling a molecule that will
recognize a specific RNA sequence and a molecule carrying the splicing
inhibition activity (Fig. 3.2B). Both strategies present the strong advantage
of being sequence-specific, meaning that, in theory, only the targeted gene
will be impacted by the exon skipping.






Table 3.1 Different types of antisense oligonucleotides used in exon skipping strategy 

Drug name Molecule name Full name Structure References

Figure 3.2 Exon skipping strategies. (A) A pre-mRNA harboring a PTC in the exon 2 is
subject to constitutive splicing resulting in the synthesis of a PTC-containing mRNA that
will be degraded by NMD. (B) An antisense oligonucleotide (AS ON) is used to mask
the 3' ss upstream of the exon 2, preventing its recognition by splicing factors that will
bind to other 3' ss. The consequence is the exon 2 skipping and the absence of the PTC
in the resulting mRNA. (C) A splicing inhibitor is tethered to the exon 2, thanks to an
RNA molecule consisting of a binding sequence for the splicing inhibitor and an
antisense sequence to the exon 2. The consequence is the inhibition of the 3' and 5'
splicing surrounding the exon 2 and the skipping of this exon, resulting in the absence
of PTC in the mRNA.


As a therapeutic approach, removing an internal part of a protein
becomes possible if two parameters are considered: (1) the part removed
from the internally truncated protein has to be nonessential for the function
of the protein and (2) the skipped exon(s) should have a size corresponding
to a multiple of three nucleotides, so that the original ORF is restored when
only one exon is planned to be skipped.

Figure 3.3 Schematic representation of the protein domain organization of the
dystrophin. The exons encoding the different domains are indicated at the top. Dystrophin
can be divided into four domains named N-terminal domain (orange), the rod domain
containing 24 repeats (purple) and four hinge domains (green), a cysteine rich domain
(blue), and a C-terminal domain (pink).
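
The reading-frame condition stated in point (2) can be illustrated with a
minimal Python sketch. The function name and the exon lengths are
hypothetical and chosen only for illustration; the sketch simply checks
whether the total length of the skipped exon(s) is a multiple of three.

```python
def skipping_preserves_frame(exon_lengths):
    """Return True if skipping these exons keeps the downstream reading frame.

    exon_lengths: lengths, in nucleotides, of the exons skipped together.
    The frame is kept only when the total removed length is a multiple of 3.
    """
    return sum(exon_lengths) % 3 == 0


# Hypothetical exon lengths, used only for illustration.
single_in_frame = skipping_preserves_frame([150])       # 150 = 3 x 50
single_out_of_frame = skipping_preserves_frame([100])   # 100 is not a multiple of 3
pair_in_frame = skipping_preserves_frame([100, 101])    # 201 = 3 x 67

print(single_in_frame, single_out_of_frame, pair_in_frame)
```

The last case shows why several exons sometimes have to be skipped together:
an exon that shifts the frame on its own can be compensated by a neighboring
exon when their combined length is a multiple of three.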




1.2 Examples 

The exon skipping strategy has already reached clinical trials in the case of
Duchenne muscular dystrophy (DMD). In this pathology, about 75%
of patients could be treated by exon skipping (Aartsma-Rus et al., 2003)
and, in particular, about 16% of patients could be targeted by an exon 51
skipping therapy. The exon 51 encodes a part of the central domain of the
dystrophin protein called the rod domain, which extends from exon 8 to exon 62
(Fig. 3.3). The rod domain is formed by 24 repeats similar to a peptide motif
found in β-spectrin. Interestingly, about 60% of the rod domain can be
deleted without severe consequences on the dystrophin function (England
et al., 1990).

Two strategies have been developed to induce the skipping of the exon
51 or other exons in the rod domain. The first one is to mask the 3' splice
site of an intron in order to induce the skipping of the following exon(s),
using an antisense oligonucleotide. The second strategy focuses on inhibiting
the splicing of a particular exon by tethering a splicing inhibitor on this
exon. In order to achieve this, a modified U7 snRNA bound by hnRNPA1
was designed to anneal with a specific sequence of the target exon
(Goyenvalle et al., 2009). Both strategies gave very encouraging results
with the synthesis of internally truncated dystrophin protein in cell
culture and animal models, such as mouse or dog (Aartsma-Rus et al., 2003;
Barbash et al., 2013; Goyenvalle et al., 2012; Hoogaars et al., 2012; Vulin
et al., 2012).

Based on the positive results ex vivo on cell culture, as well as in vivo in
mouse and dog models, clinical trials were attempted (for the definition of
the clinical trial phases, see Note 3.1).

NOTE 3.1 The Clinical Trial Phases

Clinical trial phase I: This phase is the first study on humans and requires a small
number of healthy people or patients (between 20 and 80). The toxicity and the
tolerance of the drug are evaluated during this phase.

Clinical trial phase II: The aim of this second phase is to determine the minimal
efficient dosage. The study is performed on 100-300 voluntary patients who will
help to demonstrate some therapeutic benefit, and to identify secondary effects.

Clinical trial phase III: This study measures the efficacy of the drug versus a
placebo, or a reference treatment. Several hundred to several thousand patients are
recruited at this stage. It is the final step before the authorization to put the
drug on the market.

Clinical trial phase IV: This phase starts after the introduction of the drug on
the market and allows the identification of secondary effects or toxicity after a
long period of use.

The number of patients participating in clinical trials can be much
lower in the case of the development of treatments for rare diseases, due to the
limited number of patients.


Several trials were planned, such as the one supported by
GlaxoSmithKline (GSK) and Prosensa Therapeutics with the molecule
Drisapersen, a drug using a 2'-O-methyl phosphorothioate antisense
oligonucleotide to induce the exon 51 skipping of the dystrophin gene (Clinical
trial NCT01803412). At the end of the clinical trial phase II, after 24 weeks
of treatment, patients who received Drisapersen walked 35.8 m
farther than patients who received the placebo in the 6-min walk test (6-MWT,
see Note 3.2) (Butland et al., 1982). Unfortunately, this drug failed at the
clinical trial phase III, since the rescue of the dystrophin function was not
significant in the 6-MWT. Another clinical trial, also using an antisense approach,
has been completed up to the clinical phase II by Sarepta Therapeutics,
with a drug called Eteplirsen that also induces the skipping of the exon 51
of the dystrophin gene, using a phosphorodiamidate morpholino oligomer
(PMO) (Clinical trial NCT01396239). Results of this clinical trial indicate
that patients treated with Eteplirsen were capable of walking about 67 m farther
than the control group treated with a placebo (Mendell et al., 2013). At the
start of the trial, patients were able to walk 200-400 m in the 6-MWT.
Interestingly, patients treated with Eteplirsen maintained their original 6-min
distance after 48 weeks of treatment, while the performance of patients treated
with placebo decreased. This result suggests that the treatment
was not able to induce an increase in the muscular mass, but prevented the
existing mass from decreasing, which is already a very encouraging result.

NOTE 3.2 The 6-Min Walking Test (6-MWT)

This test measures the distance that a patient can cover in 6 min of walking without
physical assistance. The patient should do the exercise as fast as possible, but is
allowed to slow down or even to stop to rest for a while during the exercise. The
test has to be done on flat ground without any obstacles, over a length of at
least 25 m without turns.

For a healthy person, the expected distance can be calculated with the
following formula:

d = 218 + (5.14 × height in centimeters) - (5.32 × age) - (1.8 × weight in
kilograms) + [51.31 × gender (1 for men and 0 for women)].

As an example, a woman of 45 years of age, 157 cm tall and weighing 50 kg, is
expected to cover about 696 m.
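
The formula of Note 3.2 can be checked with a short Python sketch. The
function and parameter names are hypothetical; the coefficients are those
given in the note, and the example reproduces the case of the 45-year-old
woman.

```python
def expected_6mwt_distance(height_cm, age_years, weight_kg, is_male):
    """Expected 6-MWT distance in meters for a healthy person (Note 3.2)."""
    gender = 1 if is_male else 0  # 1 for men and 0 for women, as in the note
    return (218
            + 5.14 * height_cm
            - 5.32 * age_years
            - 1.8 * weight_kg
            + 51.31 * gender)


# Worked example from the note: woman, 45 years old, 157 cm, 50 kg.
distance = expected_6mwt_distance(157, 45, 50, is_male=False)
print(round(distance))  # 218 + 806.98 - 239.4 - 90 = 695.58, i.e. about 696 m
```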


The exon 51 is not the only exon of the dystrophin gene eligible for the exon
skipping strategy. Indeed, the exon 53 has similar characteristics to exon 51.
A clinical trial phase I sponsored by Nippon Shinyaku Pharmaceuticals, which
started in Jun. 2013 with the drug NS-065/NCNP-01 (NCT02081625) and
was expected to be completed by Mar. 2015, targets the skipping of
the exon 53. NS-065/NCNP-01 is a morpholino antisense
oligonucleotide.

1.3 Weaknesses 

Although exon skipping is a highly targeted approach, an interaction of
the antisense sequence used with a similar or close sequence present
somewhere else in the genome cannot be excluded. It is possible, therefore, that
the expression of other genes becomes modified due to the presence of
the antisense oligonucleotide or the splicing inhibitor. However, the risk
remains limited, since the length of the targeted sequence is more than
20 nucleotides. The probability of finding a given 20-nucleotide sequence at
a random position is about 1/10^12, which means that a specific sequence of
20 nucleotides can be found randomly once in a size equivalent to about
300 human genomes [the human genome is about 3 × 10^9 base pairs (bp)].
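
The order of magnitude given above can be reproduced with a small Python
sketch. It assumes, for simplicity, that the four nucleotides are equally
frequent and independent at each position and that a single strand is
scanned; these assumptions and all names are illustrative only.

```python
GENOME_SIZE_BP = 3_000_000_000  # about 3 x 10^9 bp, as stated in the text


def possible_sequences(length):
    """Number of distinct nucleotide sequences of the given length."""
    return 4 ** length


def genomes_per_random_match(length, genome_size=GENOME_SIZE_BP):
    """Number of genome equivalents scanned, on average, per chance match."""
    return possible_sequences(length) / genome_size


n_sequences = possible_sequences(20)          # 4^20, about 1.1 x 10^12
n_genomes = genomes_per_random_match(20)      # a few hundred genome equivalents
print(n_sequences, round(n_genomes))
```

With the exact value of 4^20 rather than the rounded 10^12, the estimate comes
out slightly above 300 genome equivalents, which is the same order of
magnitude as the figure quoted in the text.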

The second weak point of this strategy is that it will apply only to
specific mutations occurring in exon(s) whose size is a multiple of three
nucleotides, in order to restore the original ORF. Otherwise several
exons will have to be skipped to restore the original ORF. In addition,
the missing part of the protein has to be nonessential for the function of the
protein. These two restrictions strongly reduce the number of patients that
can be targeted by this approach.

The third weak point is the control of the number of skipped exons.
Indeed, by forcing the spliceosome to ignore one splice site, the aim is often to
direct the use of the most proximal splice site and remove an internal part, as
small as possible, in order to reduce the impact on the function of the protein.
Unfortunately, the use of distal splice sites cannot be prevented, generating
many different mRNA isoforms and, eventually, different protein isoforms.

In parallel to preventing the use of a splice site, this strategy should be
coupled to a method that favors the use of the splice site leading to the
synthesis of a functional truncated protein. For example, by tethering a splicing
activator on the exon downstream of the exon carrying the mutation, it
might be possible to favor the use of its splice sites (Fig. 3.4).

The last limitation of this strategy, as a therapeutic approach, is related
to the delivery of the construct into a full organism, or into a specific cell type.
This delivery can be done either as a final product (oligonucleotide,
ribonucleoprotein) or as an expression vector. In addition, the expression has
to be targeted and stable over time. In the case of the antisense oligonucleotide
strategy, the oligonucleotides have to be provided to the patient constantly,
since such molecules will be diluted after cell division and are also subject
to decay. The route of administration, the stability of these molecules in
different body fluids, the efficiency of cell penetration, and the amount
necessary to get an effect are the biggest challenges for this approach.

Figure 3.4 Possible improvement of the exon skipping strategy by concomitant use of
a splicing inhibitor tethered on the exon to skip and splicing activators on the
neighboring exons.

In the case of the tethering of splicing inhibitors, the use of a modified
lentivirus or retrovirus allows the integration of a DNA construct in the genome
of the infected cells, leading to a stable expression. The problem is that the
integration itself might interfere with the expression of nearby genes. In
addition, an exogenous DNA element is brought to the genome of patients, with
no guarantee that this element will not move in the host genome.

2 TRANS-SPLICING
2.1 Principle 

Trans-splicing is a splicing reaction between two RNA molecules (Fig. 3.5).
Basically, the spliceosome uses the 5' splice site from one molecule and the
branch point, together with the 3' splice site, from another molecule to
ligate two exons from two different molecules. This mechanism, which occurs
naturally in eukaryotic cells, was originally identified in trypanosomes,
and then in flatworms (Davis et al., 1995; Murphy et al., 1986), before being
observed in human cells (Flouriot et al., 2002). Thanks to this mechanism, it
is possible to replace a 5' end (5' trans-splicing), a 3' end (3' trans-splicing), or
an internal part of an mRNA [internal exon replacement (IER)] (Fig. 3.5)
(Wally et al., 2012). The trans-splicing approach is particularly suited to
correcting mutations in very long genes for which gene therapy is very
challenging. However, this approach can also be considered for fixing genes of any
size. Although the trans-splicing approach has not yet been included in a clinical
trial, successful corrections of mutations have been reported in cultured
cells and in mouse models.

The RNA trans-splicing molecule (RTM) is introduced in the cell
either by transfecting a plasmid carrying the gene that expresses the RTM,
or by using a virus that will express the RTM. The RTM harbors the exon
sequence that will replace the original one encoded by the endogenous
gene, and a 5' and/or a 3' splice site whose strength has to be equivalent to,
or even stronger than, that of the one carried by the pre-mRNA. For that, the
sequence of the splice sites carried by the RTM has to be close to the
consensus splice site sequences (for 5' ss: GURAGU, where R is for a purine;
for 3' ss: (C/U)nNCAG). The splice site is followed by a partial intronic
sequence, a spacer, and finally an antisense sequence complementary to
a portion of the intron to be trans-spliced (Fig. 3.5). This last part of the
construct, called the binding domain (BD), is around 100-150 nucleotides long,
in order to minimize the nonspecific trans-splicing reaction on another
pre-mRNA, or at another intron in the same target pre-mRNA, and to
ensure the efficiency of the trans-splicing reaction (Puttaraju et al., 2001;
Walls et al., 2008). Finally, in the case of a 3' trans-splicing reaction, a
polyadenylation site is added at the 3' end of the RTM, in order to ensure the
correct processing of the trans-spliced mRNA.

Figure 3.5 The different categories of trans-splicing. At the top, an example of
cis-splicing is shown. The second panel from the top explains the 3' trans-splicing in which
a downstream exon is brought by an RNA trans-splicing molecule (RTM). The splicing
of the intron 2 is performed between the 5' ss of the pre-mRNA and the 3' ss of the
RTM. The third panel from the top represents an example of 5' trans-splicing, in which
an upstream exon is brought by the RTM. The splicing reaction at the intron 1 is done
between the 5' ss of the RTM and the 3' ss of the intron 1 of the pre-mRNA. The bottom
panel represents an example of IER for which the splicing of the intron 1 is performed
between the 5' ss of the intron 1 of the pre-mRNA and the 3' ss of the RTM, and the
splicing of the intron 2 is done between the 5' ss of the RTM and the 3' ss of the intron 2 of
the pre-mRNA. The branch point (BP) of each intronic sequence is symbolized by an "A."

2.2 An Example of Trans-Splicing Used as Therapeutic 
Approach for Duchenne Muscular Dystrophy 

Due to the giant size of the dystrophin gene (2.5 Mb), trans-splicing appears
as an attractive approach to correct mutations affecting this gene and, in
particular, nonsense mutations. The three types of trans-splicing have been
explored on the dystrophin model in vitro, as well as in vivo, since mouse
models exist for that pathology. In 2013, a study showed that by using 3'
trans-splicing, it is possible to correct a nonsense mutation present in the exon 23
of the dystrophin gene found in the mdx mouse model (Lorain et al., 2013). To
do that, the authors used an adeno-associated virus (AAV) construct carrying an
antisense sequence complementary to the intron 22, a spacer sequence followed by a
hemi intron, including a branch point and a 3' splice site, and finishing with a
cDNA sequence of the exon 23 ligated to the exon 59, generating a
functional truncated dystrophin (Fig. 3.6). The AAV constructs were injected in
the tibialis anterior muscle for 4 weeks. After this treatment, genomic DNA
was extracted to verify the integration of the AAV construct. RNAs were
also extracted to analyze the efficiency of trans-splicing, which reached
about 30%. Interestingly, the same construct can be used to correct another
nonsense mutation located in the exon 53 of the dystrophin gene, in the
mdx4Cv mouse model (Fig. 3.6). Indeed, several mutations located on
different exons can be corrected with the same construct, which increases the
potential of this approach, especially with the prospect of an industrial
development.

The same group of researchers also showed that the nonsense mutation
present in the exon 23 of the dystrophin gene can be corrected in mdx mice
by using an IER trans-splicing strategy (Lorain et al., 2010). The RTM they
used consisted, from the 5' end, of an antisense sequence complementary to
the intron 22, a spacer, then a hemi intron containing a branch point and
a 3' splice site, the exon 24, a 5' splice site and a hemi intron sequence, a
spacer, an antisense sequence complementary to the intron 23, and finally,
a polyadenylation signal. Intronic splicing enhancer sequences were added
to each hemi intron sequence, which improved the efficiency of
trans-splicing about ninefold, allowing it to reach 45% of trans-spliced dystrophin
mRNA.

Figure 3.6 Correction of nonsense mutation in the dystrophin gene in the exon 23 or 53
using a unique RTM. The trans-splicing reaction occurred between the 5' ss of the
dystrophin pre-mRNA and the 3' ss of the RTM. The RTM consists of a binding domain (BD)
of about 150 nucleotides, the BP symbolized by an "A," the 3' ss and the exon 23 followed
by the dystrophin exons from 59 to 79. The polyadenylation site was also included in
the RTM. The protein encoded by the trans-spliced mRNA is functional, even with the
internal truncation.

2.3 Weaknesses 

The major weak point of this approach is the efficiency of trans-splicing,
which can reach 45% but may not be sufficient to cure the consequences of
a mutation. Another challenge is the toxicity that can arise from the
RTM. Indeed, translation can occur directly on the RTM, leading to
the production of unwanted peptides with a putative risk of toxicity. The
proteins produced from the RTM can be very complex to
identify, since cis-splicing can also occur on the RTM, generating more
possibilities to create ORFs (Monjaret et al., 2014). This approach has not
yet reached the clinical trial stage due to concerns about its safety, but it still
represents a very attractive approach to fix any type of mutation,
particularly in big genes.

Figure 3.7 Principle of the readthrough. In the left panel, when the ribosome reaches
a stop codon (UAG), the release factor (eRF) enters the A site of the ribosome to
terminate the translation. In the right panel, when the ribosome reaches the stop codon,
a competition occurs between the release factor and a tRNA that can enter the A site to
recognize the stop codon under readthrough circumstances. The translation can then
go on until the next stop codon.

3 PTC-READTHROUGH 
3.1 Principle 

Readthrough is a natural mechanism that consists of the incorporation of
an amino acid at the PTC position (Fig. 3.7). It could be considered as
a mistake made by the translation machinery, since the stop codon is not
recognized by the translation termination machinery. Readthrough of stop
codons has been found to occur naturally on particular mRNAs and in
various species. Indeed, about 5% of stop codons are reassigned in Drosophila
(Jungreis et al., 2011). In humans, the identification of natural readthrough is
under investigation and only five genes have been clearly demonstrated to
use readthrough to modulate their expression (SACM1L, OPRK1, OPRL1,
BRI3BP, and MYELIN P0) and at least three others are predicted to be
subject to readthrough on their stop codon (ACP2, MAPK10, and AQP4)
(Loughran et al., 2014; Yamaguchi et al., 2012). An increasing number of
human genes supporting readthrough on their normal termination codon is
expected to be identified in the near future, since a specific nucleotide context
has been identified to favor readthrough, and such a sequence can be found
in many genes. As a matter of fact, the 5' and the 3' sequences influence the
rate of readthrough, and the sequence CUAG immediately 3' of the stop codon or,
at a more degenerate level, the sequence CARYYA (R for purine and Y
for pyrimidine) is found around stop codons subject to readthrough in
most cases (Loughran et al., 2014). Until now, natural readthrough has
been found to occur on the UGA stop codon, likely because this stop codon
is the leakiest. However, readthrough can also occur on the UAG or UAA stop
codon with a lower efficiency than on the UGA stop codon, as
demonstrated in cell culture (Loughran et al., 2014).

In the case of a nonsense mutation, readthrough of the PTC represents
a very interesting therapeutic strategy, since the final product would be the
full-length protein, with a maximum of one amino acid
different from the wild-type protein. Indeed, the amino acid incorporated at
the PTC position could be the same as the one present in the wild-type
protein, or a different one. This means that, in most cases, the protein
will be functional, unless the residue at the PTC position is crucial for the
function of the protein and the incorporated amino acid is not compatible with
this function. The incorporated amino acid depends on the identity of the stop
codon that is read through. Indeed, a UAA or UAG stop codon will favor the
incorporation of a glutamine at the PTC position, while a UGA stop codon
will promote the incorporation of an arginine, a tryptophan, or a cysteine
(Feng et al., 1990). A recent study in yeast refines the incorporation rate of
amino acids at the PTC position as follows: there is a 54% chance to
incorporate a tyrosine, a 44% chance to incorporate a glutamine, and a 2% chance
to incorporate a lysine at a UAA stop codon; for the UAG stop codon,
tyrosine will be incorporated in 92% of cases, glutamine in 5% of cases, and
lysine in 3% of cases. Finally, for the UGA stop codon, tryptophan is the
major incorporated amino acid (82% of cases), then cysteine (14-17%
of cases) followed by arginine (1-4% of cases) (Blanchet et al., 2014). In the
absence of a dedicated analysis, it cannot be excluded that the incorporation
rate of amino acids could be slightly different in other species and could
depend on the molecule used to activate readthrough.
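
The incorporation rates reported in yeast can be organized as a small lookup
table. The following Python sketch is only illustrative: the names are
hypothetical, the percentages are those quoted above from Blanchet et al.
(2014), and, as noted, they may not hold in other species or with other
readthrough-inducing molecules.

```python
# Percent chance of each amino acid being inserted at the stop codon (yeast).
# Values reported as ranges in the text are stored as (low, high) tuples.
READTHROUGH_INSERTION = {
    "UAA": {"Tyr": (54, 54), "Gln": (44, 44), "Lys": (2, 2)},
    "UAG": {"Tyr": (92, 92), "Gln": (5, 5), "Lys": (3, 3)},
    "UGA": {"Trp": (82, 82), "Cys": (14, 17), "Arg": (1, 4)},
}


def wild_type_restoration_chance(stop_codon, wild_type_residue):
    """Return the (low, high) percent chance that readthrough of the PTC
    inserts the residue originally present in the wild-type protein."""
    return READTHROUGH_INSERTION[stop_codon].get(wild_type_residue, (0, 0))


# A tryptophan codon (UGG) mutated to UGA: the wild-type residue is favored.
print(wild_type_restoration_chance("UGA", "Trp"))
# A glutamine codon (CAG) mutated to UAG: the wild-type residue is rare.
print(wild_type_restoration_chance("UAG", "Gln"))
```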

In the development of nonsense suppression therapies by readthrough, two
major parameters need to be taken into account: (1) the identity of the
stop codon, since, for many molecules activating readthrough, UGA has been
shown to be the easiest stop codon to read through and UAA the most
difficult one, and (2) the nucleotide context of the stop codon. We saw
that, in the case of natural readthrough on the physiological stop codon,
the CUAG sequence immediately downstream of the stop codon favors
readthrough (Loughran et al., 2014). The influence of the nucleotide
context in the vicinity of the PTC has been studied, and it was
demonstrated that the presence of a uridine immediately upstream of the
PTC and a cytosine downstream at position +4 promote the highest rate of
readthrough induced by gentamicin, making the sequence “U stop C” the best
candidate for readthrough therapy (Floquet et al., 2012). However, the
nucleotide context influences readthrough differently according to the
identity of the stop codon (Manuvakhova et al., 2000) (Fig. 3.8). Indeed,
for UGA and UAA stop codons, a cytosine at position +4 seems to provide
the best rate of readthrough by most aminoglycosides, unlike the UAG stop
codon, for which a uridine at position +4 promotes the highest rate of
readthrough (Manuvakhova et al., 2000). It is worth noting that all these
conclusions were obtained from in vitro or ex vivo analyses and that the
efficiency of readthrough could be very different in vivo, as already
reported (Manuvakhova et al., 2000).

Figure 3.8 Nucleotide sequences promoting efficient readthrough according
to the stop codon. The number under the mRNA corresponds to the position
relative to the stop codon.
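The context rules summarized above (a uridine at position -1, and a cytosine or uridine at position +4 depending on the stop codon) can be checked mechanically on a sequence. The sketch below is illustrative only: the function name, the 0-based indexing, and the returned dictionary layout are assumptions, and position +4 is taken to be the nucleotide immediately following the three-nucleotide stop codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# +4 nucleotide described in the text as giving the best aminoglycoside-
# induced readthrough for each stop codon (Manuvakhova et al., 2000).
FAVOURABLE_PLUS4 = {"UGA": "C", "UAA": "C", "UAG": "U"}


def describe_ptc_context(mrna, stop_start):
    """Report the -1 and +4 nucleotides around a stop codon.

    mrna       -- sequence written 5'->3' (DNA or RNA letters accepted)
    stop_start -- 0-based index of the first nucleotide of the stop codon
    """
    seq = mrna.upper().replace("T", "U")
    codon = seq[stop_start:stop_start + 3]
    if codon not in STOP_CODONS:
        raise ValueError(f"{codon!r} at index {stop_start} is not a stop codon")
    minus1 = seq[stop_start - 1] if stop_start > 0 else None
    plus4 = seq[stop_start + 3] if stop_start + 3 < len(seq) else None
    return {
        "stop_codon": codon,
        "minus1": minus1,
        "plus4": plus4,
        # "U stop C" context reported for gentamicin (Floquet et al., 2012)
        "u_stop_c": minus1 == "U" and plus4 == "C",
        "favourable_plus4": plus4 == FAVOURABLE_PLUS4[codon],
    }
```

For example, `describe_ptc_context("AUGGCUUGACGG", 6)` inspects a UGA codon preceded by U and followed by C. Because the underlying observations come from in vitro and ex vivo assays, such a check is at most a first-pass filter.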

Two types of molecules have been used to induce PTC-readthrough:
suppressor transfer RNAs (tRNAs) and chemical molecules. The first type is
obtained by changing the anticodon sequence of a tRNA so that it
recognizes one of the stop codons (Fig. 3.9). Since this tRNA still
carries an amino acid, it will change a stop codon into a sense codon.
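The anticodon change behind a suppressor tRNA follows from base pairing: the anticodon is the reverse complement of the codon when both are written 5' to 3'. The sketch below assumes strict Watson-Crick pairing and ignores wobble and modified bases; the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the Watson-Crick anticodon (5'->3') for a codon (5'->3')."""
    rna = codon.upper().replace("T", "U")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def anticodon_changes(sense_codon, stop_codon):
    """List (position, old base, new base) needed to retarget the anticodon.

    Positions are 1-based and counted along the anticodon, 5'->3'.
    """
    old = anticodon_for(sense_codon)
    new = anticodon_for(stop_codon)
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]
```

With the serine codon UCA and the stop codon UAA, `anticodon_for` gives UGA and UUA respectively, so a single substitution in the anticodon is enough to retarget the tRNA.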

Although suppressor tRNAs have been mainly studied in lower eukaryotes,
natural suppressor tRNAs have also been identified in mammals (Kuchino and
Muramatsu, 1996). Interestingly, such natural suppressor tRNAs are tightly
regulated in order to act on specific stop codons and to respond to
particular stimuli at the right moment. In the case of selenoproteins, for
example, a dedicated tRNA, first charged with serine that is then
converted to selenocysteine, decodes a UGA stop codon. The readthrough of
the UGA stop codon is conditioned by the presence of selenium, and is
highly regulated via a cis sequence on the mRNA encoding the
selenoprotein, called SECIS (for selenocysteine insertion sequence), that
adopts a specific kink-turn secondary structure, and via at least three
proteins (the SECIS binding protein 2 (SBP2), the selenocysteine-specific
translation elongation factor (eEFSec), and the ribosomal protein L30)
(Chavatte et al., 2005; Copeland et al., 2000; Fagegaltier et al., 2000;
Tujebajeva et al., 2000). Due to a methylation on one uridine of the
anticodon (UUG→UmUG), tRNAs carrying glutamine have also been shown to
read the UAG stop codon, in addition to the CAA codon (Kuchino and
Muramatsu, 1996). With a deeper understanding of the regulation of these
natural suppressor tRNAs, their use could represent an additional
therapeutic strategy to correct nonsense mutations (Temple et al., 1982).

Figure 3.9 Transformation of a tRNA into a suppressor tRNA by changing the
anticodon UGA of the tRNA carrying the serine amino acid into the
anticodon UUA to recognize the stop codon UAA.

3.1.1 Aminoglycoside Molecules

Chemical molecules represent the second type of molecules capable of
inducing readthrough. Indeed, some particular chemical molecules have been
shown to improve the efficiency of PTC-readthrough (Table 3.2). Some
aminoglycoside family members, for instance, have the capacity to
efficiently read through PTCs with no effect on the physiological stop
codons. Aminoglycosides are composed of a sugar molecule carrying at least
one amino substitution. They were originally identified for their
antibiotic properties. Among the members of this family, kanamycin,
tobramycin, neomycin, and amikacin have been studied for several decades.
However, gentamicin and geneticin (also called G418) are the ones that
have been studied most deeply for their capacity to correct PTCs. The
efficiency of readthrough varies according to the aminoglycoside molecule
and, in most cases, does not exceed 5% of the wild-type expression, with
molecules such as geneticin or negamycin (Allamand et al., 2008). By
comparing the different aminoglycoside members for their capacity to
induce readthrough, it appears that a hydroxyl at position C6' in ring I
is present in aminoglycosides with efficient readthrough activity,
suggesting that this chemical group could play an essential role in the
mechanism of readthrough (Manuvakhova et al., 2000) (Table 3.2).
Aminoglycosides have been tested in mouse models and in humans, and
successfully rescued the expression of PTC-containing genes (Du et al.,
2002; Guerin et al., 2008; Rowe et al., 2011; Wang et al., 2012; Xue
et al., 2014). They were also shown to successfully restore the expression
of CFTR in human clinical trials (Sermet-Gaudelus et al., 2007;
Wilschanski et al., 2003), but it has long been known that prolonged use
of aminoglycosides leads to irreversible ototoxicity and reversible
nephrotoxicity (Greenwood, 1959; Hettig and Adcock, 1946; Hinshaw et al.,
1946; Hock and Anderson, 1995; Swan, 1997; Toubeau et al., 1986). About
25% of patients treated with aminoglycosides developed a form of
ototoxicity. Depending on the aminoglycoside, the ototoxicity is
predominantly cochleotoxic, with kanamycin, neomycin, or amikacin, or
vestibulotoxic, with gentamicin or tobramycin (Matz, 1993). The
ototoxicity appears a few days after systemic exposure to aminoglycosides
(Heck et al., 1963). Unlike what was originally hypothesized, the toxicity
of aminoglycosides is unrelated to their ability to read through nonsense
codons, and would rather be related to their capacity to inhibit
translation by mitochondrial ribosomes (Shulman et al., 2014). Indeed, it
has been demonstrated that aminoglycosides interact with mitochondrial
ribosomes (Qian and Guan, 2009).

Concerning the toxic aspect of aminoglycosides, three different ways have
been explored in order to decrease it and to continue with the readthrough
therapeutic approach using aminoglycosides or other molecules. The first
one is to block the toxicity of aminoglycosides by combining them with
drugs that decrease this toxicity. The second is to chemically modify
aminoglycoside members in order to decrease their toxic effect without
affecting their readthrough efficiency, or even while increasing it
(Nudelman et al., 2006). The third solution is to abandon aminoglycosides
and look for readthrough molecules unrelated to the aminoglycoside family.

One possibility to decrease the toxicity of aminoglycosides is a
cotreatment with another molecule dedicated to blocking this toxicity.
Aminoglycoside toxicity manifests as an increase in inner ear cell death
by apoptosis and the production of reactive oxygen species (ROS). Several
combinations of treatments have been explored in order to inhibit these
consequences. Indeed, antiapoptotic molecules such as zVAD-fmk, a
pan-caspase inhibitor, have been shown to prevent apoptotic damage by
aminoglycosides in vitro as well as in vivo (Cheng et al., 2003; Forge and
Li, 2000; Matsui et al., 2003; Nakagawa et al., 2003; Okuda et al., 2005;
Williams and Holder, 2000). Blocking the function of kinases involved in
apoptosis activation, such as c-Jun N-terminal kinase (JNK), also shows
efficient results in vitro and in vivo, in particular under neomycin
exposure (Bodmer et al., 2002; Bonny et al., 2001; Matsui et al., 2004;
Nakamagoe et al., 2010; Sugahara et al., 2006; Wang et al., 2003; Ylikoski
et al., 2002). Antiapoptotic treatment can only be a short-term solution,
due to the risk of inducing tumorigenesis when apoptosis is inhibited.
Unfortunately, aminoglycosides are not metabolized and can be detected in
cells for months, suggesting that aminoglycoside toxicity can occur even
after an antiapoptotic treatment. Therefore, the antiapoptotic strategy is
likely not the best cotreatment to provide to patients who would be
exposed to aminoglycosides.

To neutralize the ROS produced during a treatment with aminoglycosides,
iron chelators have been tested, such as deferoxamine,
2,3-dihydroxybenzoate, or aspirin (Lecain et al., 2007; Sinswat et al.,
2000; Song and Schacht, 1996; Wu et al., 2001). The latter has been tested
in clinical trials and efficiently decreases the ototoxicity of
aminoglycosides (Behnoud et al., 2009; Chen et al., 2007; Sha et al.,
2006). It could represent a practical solution, since aspirin is a very
well-known drug and is very well tolerated.

In parallel to trying to decrease the toxicity of aminoglycosides by a
cotreatment with other molecules, some laboratories synthesized several
new aminoglycoside-derivative molecules, and tested their readthrough
efficiency and toxicity. Among them, three molecules named NB54, NB84, and
NB124 show very promising results ex vivo, in cultured cells, as well as
in vivo, in mouse models (Rowe et al., 2011; Wang et al., 2012; Xue
et al., 2014). These molecules are more efficient than gentamicin or
geneticin, and less toxic than the original aminoglycosides. Thanks to
these structure-function studies, aminoglycosides are being optimized, and
more molecules are becoming available that will hopefully reach clinical
trials.

The last option to solve the issue of aminoglycoside toxicity is to look 
for molecules that are not related to aminoglycosides and are capable of 
rescuing the expression of PTC-containing genes with high readthrough 
efficiency, and with a low toxicity (see chapter 3 section 3.1.2). 

3.1.2 Nonaminoglycoside Molecules

Due to the toxicity of aminoglycosides, screens have been developed in
order to identify molecules with the capacity to induce PTC-readthrough,
hoping to find molecules with no or low toxicity (Table 3.2). In 2007, the
result of a screen by the company PTC Therapeutics was reported, and at
least one molecule was identified. It was initially named PTC124 and is
now known as ataluren (Welch et al., 2007). Like aminoglycosides, PTC124
reads through UGA codons most efficiently, then UAG and, modestly, UAA
stop codons. The efficiency of PTC124 was demonstrated ex vivo by
incubating cultured cells harboring a nonsense mutation with PTC124 or
DMSO as a control. Readthrough activity was monitored by western blot, by
measuring the level of the full-length protein encoded by the
PTC-containing gene. The full-length protein was clearly observed in the
presence of PTC124, validating the results of the screening assay. These
results were confirmed in vivo by exposing MDX mice, which harbor a
nonsense mutation in the gene encoding dystrophin, to PTC124. The authors
followed the physical capacities of the mice, and compared mice exposed to
PTC124 with mice exposed to the mock buffer. Mice exposed to PTC124
clearly showed an increased physical activity, and the expression of
dystrophin in cells was monitored by immunofluorescence, using an antibody
raised against dystrophin and coupled with a fluorophore. Since this
original study, PTC124 has been successfully tested on different mouse
models of diseases such as cystic fibrosis (Du et al., 2008), Usher
syndrome (Goldmann et al., 2011, 2012), DMD (Kayali et al., 2012),
aniridia (Gregory-Evans et al., 2014), infantile neuronal ceroid
lipofuscinosis (Miller et al., 2015), or isolated proximal renal tubular
acidosis (Fang et al., 2015). Consistent with these in vivo results,
clinical trials were then started, with encouraging results. The molecule
is very well tolerated by patients and some improvements were observed in
patients with DMD or cystic fibrosis. For instance,
patients with DMD included in the clinical phase IIb trial and receiving a
low dosage of ataluren (10 mg/kg in the morning, 10 mg/kg at midday, and
20 mg/kg in the evening) walked 13 m less at the 6-MWT after 48 weeks of
treatment than at the beginning of the clinical trial, when they could
walk 360 m. Patients receiving the placebo or a high dosage of ataluren
(20 mg/kg in the morning, 20 mg/kg at midday, and 40 mg/kg in the evening)
lost about 42 m of walking distance at the 6-MWT at the end of the
clinical trial (Haas et al., 2015). These results indicate that ataluren,
at the low dosage, slows down the loss of the physical capacity of DMD
patients. That is already a very encouraging result for patients and
clinicians, since no treatment to cure DMD is available yet. In the case
of cystic fibrosis, positive results in humans were quickly obtained, with
the rescue of the expression of CFTR in patients harboring a nonsense
mutation in this gene (Sermet-Gaudelus et al., 2010), and the results of
the clinical phase III trial were released in 2014 (Kerem et al., 2014).
After 48 weeks, ataluren improved the forced expiratory volume in 1 s
(FEV1) by about 5.7%, compared to patients taking the placebo. These
results showed an encouraging improvement of the pulmonary function,
confirmed by a decrease in the number of exacerbations for the group of
patients taking ataluren. This study also highlighted interference between
ataluren and tobramycin, an aminoglycoside used as an antibiotic, in
particular for pulmonary infections, as is the case in cystic fibrosis
patients. Ataluren is the first molecule dedicated to correcting nonsense
mutations by readthrough to reach clinical phase III, providing real hope
for patients with genetic diseases caused by nonsense mutations. The
results of clinical trials remain modest, but they are encouraging,
especially because no other drug with a higher efficacy and no toxicity is
available yet.

The clinical results of ataluren stimulated the scientific and medical
community to find other molecules capable of correcting nonsense mutations
efficiently. To date, several other nonaminoglycoside readthrough
molecules have been identified and published. Among them, two molecules
called readthrough compounds (RTC) 13 and 14 have been shown to induce
readthrough very efficiently on ATM or dystrophin genes harboring a
nonsense mutation (Du et al., 2009). The efficiency of both compounds was
compared to that of gentamicin or geneticin, and showed a very comparable
effect on nonsense mutation readthrough. In addition, the comparison with
ataluren shows that RTC13 or RTC14 can be even more efficient than
ataluren on some of the tested models (Kayali et al., 2012; Kuschal
et al., 2013). The in vivo demonstration of the efficacy of RTC13, and the
absence of toxicity observed in mice, suggest that this compound could
represent another possibility for patients with a nonsense
mutation-related disease.

3.2 Weaknesses

PTC readthrough is a very attractive approach to correct nonsense
mutations because the result is the synthesis of a full-length protein
that has a good chance of being functional, unless the mutant position is
essential for the function of the protein and the amino acid incorporated
at the PTC position is not compatible with this function. All the
quantitative assays focusing on readthrough efficiency reported no more
than 5% production of readthrough protein, compared to the level of the
wild-type protein in nonmutant cells (Bidou et al., 2004; Kayali et al.,
2012; Wang et al., 2012). For some pathologies, such as cystic fibrosis,
such a low level of functional protein is enough to at least strongly
attenuate the pathological phenotype (Ramalho et al., 2002) but, for most
diseases, a higher rate of readthrough protein is necessary in order to
restore the cellular function of the mutant protein.

Two main reasons can explain why PTC readthrough has limited efficacy.
First, PTC readthrough is a mechanism by which the translation machinery
is forced to introduce a mistake. The translation process has a high
fidelity, with about 1 error in 10,000 amino acids (Ibba and Söll, 1999;
Loftfield and Vanderjagt, 1972), thanks to quality control mechanisms that
protect cells from synthesizing aberrant proteins. The readthrough
mechanism has to challenge these translation quality controls, which is a
very difficult task due to the requirement for high fidelity in
translation. The second reason is that mRNA candidates for PTC readthrough
are first substrates for NMD, making the amount of mRNA available for
readthrough nonexistent or very low.
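To give a sense of scale for the fidelity figure quoted above (about 1 error in 10,000 amino acids), the following sketch computes the expected fraction of error-free polypeptides of a given length. It assumes that misincorporation events are independent and occur at a uniform rate per residue, which is a simplification; the function names are hypothetical.

```python
def error_free_fraction(length_aa, error_rate=1e-4):
    """Fraction of chains of `length_aa` residues with no misincorporation.

    ASSUMPTION: errors are independent and equally likely at every
    position, so the probability of an error-free chain is (1 - p) ** n.
    """
    if length_aa < 0:
        raise ValueError("length_aa must be non-negative")
    return (1.0 - error_rate) ** length_aa


def expected_errors(length_aa, error_rate=1e-4):
    """Mean number of misincorporated residues per chain."""
    return length_aa * error_rate
```

Under these assumptions, a 300-residue protein is produced without any error in roughly 97% of cases, which illustrates how strongly the machinery that readthrough molecules must override is biased toward accuracy.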

Another potential limitation in the use of readthrough as a therapeutic
approach is the risk of inducing readthrough on the normal termination
codon. As we saw at the beginning of this chapter, readthrough can occur
on normal termination codons (Loughran et al., 2014; Yamaguchi et al.,
2012). However, readthrough on normal termination codons happens on
specific genes that have likely developed a dedicated environment to
facilitate it. The nucleotide sequence around the stop codon has evolved
to favor readthrough and, even though it has not been reported yet, it is
expected that some factors are involved in regulating the readthrough
process on those specific genes. Genes that have been shown to support
readthrough on their normal termination codon should now be monitored when
a readthrough strategy is explored, in order to evaluate putative side
effects.

A general readthrough on all normal termination codons is unlikely.
Several lines of evidence can explain why readthrough likely occurs on
PTCs, and not on most normal stop codons. The nucleotide context of a PTC
differs from that of a physiological stop codon. Translation termination
has been selected during evolution to occur on the physiological stop
codon, which is not the case for a PTC. Indeed, it is thought that the
3' UTR carries some cis-regulator elements, such as specific secondary
structures or specific binding sites, and some trans-regulator elements,
such as proteins, to help translation termination. For example, the
protein TPA1 has been shown to interact with the poly(A) binding protein
and the release factors 1 and 3, bridging the 3' UTR with the translation
termination complex (Keeling et al., 2006). The close neighborhood of a
PTC likely does not support the same proteins. Recently, nucleotide
methylation has been shown to be increased in the 3' UTR, compared to the
coding sequence, suggesting that a PTC would be surrounded by a poorly
methylated nucleotide area, unlike the physiological stop codon (Meyer
et al., 2012). However, the methylation of the PTC area has to be clearly
studied in order to exclude the possibility that the methylation status of
a PTC-nucleotide environment changes quickly after the introduction of the
PTC, to become similar to the one found at the normal termination codon.
Caution is nevertheless required, since it has been recently reported that
many peptides coming from the 3' UTR, that is, from the readthrough of
physiological termination codons, are found on the cell surface after
gentamicin treatment of cells. This suggests that we might have
overestimated the efficiency of translation termination at the
physiological stop codons (Goodenough et al., 2014).

Increasing the efficiency of readthrough by using highly efficient
molecules might be dangerous, because the natural fidelity of the
translation process could be affected, leading to the synthesis of harmful
and abnormally long readthrough proteins, due to the readthrough of the
physiological stop codon. In order to increase the amount of readthrough
protein without increasing the readthrough on the physiological stop
codon, another way has been developed, based on the fact that one reason
explaining the low efficiency of readthrough is that PTC-containing mRNAs
are first substrates for NMD, before being substrates for readthrough.
Therefore, no or a very low amount of RNA is available to become a
substrate for readthrough. Inhibiting NMD in combination with the
activation of readthrough might represent an easier way (because molecules
have already been identified) and a safer one (since the translation
process will be mildly affected). The inhibition of NMD will be described
in the next section and the combination of both strategies will be
discussed in Section 9.

4 NMD INHIBITION

4.1 Principle 

PTC-containing mRNAs are degraded by NMD, as long as the PTC position
satisfies the 50-55 nucleotides rule and/or the requirement on the
distance between the PABPC1 protein and the PTC. The consequence of NMD is
the silencing of the mutant gene, and generally not the synthesis of a
truncated protein. NMD prevents the synthesis of a harmful or a
nonfunctional truncated protein (Bhuvanagiri et al., 2010; Brogna and Wen,
2009; Chang et al., 2007; Kervestin and Jacobson, 2012; Popp and Maquat,
2014; Rebbapragada and Lykke-Andersen, 2009). However, the truncated
protein that would be synthesized without NMD would sometimes keep a
residual or the full function of the wild-type protein. Inhibiting NMD
then becomes an obvious therapeutic strategy to correct the consequences
of some nonsense mutations.
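The 50-55 nucleotides rule referred to above is commonly stated as follows: a stop codon located more than about 50-55 nucleotides upstream of the last exon-exon junction is expected to elicit NMD. The sketch below is a simplified illustration of that check only. The function name, the coordinate convention (positions counted along the spliced mRNA from the 5' end), and the default threshold of 55 are assumptions, and the PABPC1 distance criterion is not modeled.

```python
def follows_50_55_rule(ptc_end, exon_junctions, min_distance=55):
    """Return True if the PTC is predicted to elicit NMD under the rule.

    ptc_end        -- position of the last nucleotide of the PTC on the
                      spliced mRNA (illustrative coordinate convention)
    exon_junctions -- positions of the exon-exon junctions on the same mRNA
    min_distance   -- threshold in nucleotides (50-55 in the literature)
    """
    if not exon_junctions:
        # Intronless transcript: no junction downstream of the stop codon.
        return False
    last_junction = max(exon_junctions)
    return last_junction - ptc_end > min_distance
```

For instance, with junctions at positions 200, 500, and 900, a PTC ending at position 600 would be flagged, whereas a PTC ending at position 880 or located in the last exon would not.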

4.2 Examples 

Although NMD inhibition has not yet been evaluated as a therapeutic
approach, many molecules have been shown to inhibit this process
(Table 3.3). Among these NMD inhibitors are translation inhibitors, such
as cycloheximide, emetine, or puromycin, because NMD requires translation
(Shoemaker and Green, 2012). These molecules block NMD indirectly and
would be difficult to use for therapeutic approaches, since they are
primarily translation inhibitors.

The first molecules capable of inhibiting NMD and not directly linked to
translation inhibition were reported in 2004 (Usuki et al., 2004). In this
study, the authors showed that wortmannin or caffeine can inhibit NMD by
blocking the function of SMG1, the kinase that phosphorylates UPF1. As
UPF1 becomes hypophosphorylated, NMD is stopped and PTC-containing mRNAs
are stabilized. Both molecules were tested on Ullrich disease patient
cells carrying a frameshift mutation of the collagen VI-α2 gene resulting
in the introduction of a PTC in exon 22 that elicits NMD. The incubation
of these cells with 10 µM of wortmannin or 7.5 mM of caffeine leads to a
two- to threefold increase in collagen VI-α2 mRNA. Under these treatments,
the truncated protein was also detected, indicating that the stabilized
PTC-containing mRNAs support translation.

In 2007, another molecule was identified as an NMD inhibitor with no other
interfering properties. This molecule, called NMD inhibitor 1 (NMDI 1), is
an indole derivative. It has been successfully tested in cell cultures,
and in vivo in a mouse model harboring a nonsense mutation in the gene
encoding α-L-iduronidase, and showed a two- to threefold increase in the
stabilization of PTC-containing mRNAs (Durand et al., 2007; Keeling
et al., 2013). Following the demonstration of its in vivo effectiveness,
NMDI 1 could represent a good candidate for incorporation in therapeutic
approaches for genetic diseases caused by a PTC, either alone or in
combination with a readthrough activator molecule, as we will see in
Section 9. The mode of action of NMDI 1 has been studied: NMDI 1
interferes with the interaction between UPF1 and SMG5, leading to the
accumulation of hyperphosphorylated UPF1 and the inhibition of NMD (Durand
et al., 2007; Yamashita et al., 2001).

Table 3.3 Main NMD inhibitors identified to date (columns: drug name,
structure, references)

In 2009, pateamine A, a natural compound isolated from a marine sponge,
was shown to inhibit NMD about two- to threefold (Dang et al., 2009).
Pateamine A interacts with eIF4AI and II to inhibit translation and also
interacts with eIF4AIII to inhibit NMD, independently of the translation
inhibition. The protein eIF4AIII is a core component of the exon junction
complex (EJC) known to mark the exon-exon junctions after splicing (Le Hir
et al., 2000a,b) (see chapter: General Aspects Related to Nonsense
Mutations; Section 3.3). Pateamine A stabilizes the interaction between
UPF1 and UPF3X (UPF3b), interfering with the dynamics of the NMD process
(Dang et al., 2009).

Using a screen dedicated to the identification of NMD inhibitors, a drug
available on the market has been identified as an NMD inhibitor
(Gonzalez-Hilarion et al., 2012). This molecule, called amlexanox, is used
for the treatment of aphthous ulcers and of some forms of asthma (Bailey
et al., 2011; Bell, 2005; Gonsalves et al., 2007). Amlexanox has been
shown to inhibit NMD in several cell lines harboring a nonsense mutation
in the p53, dystrophin, or CFTR gene, since nonsense mutation-containing
mRNAs were stabilized two- to fourfold. Because amlexanox has already been
provided to patients for more than 30 years, there is abundant information
on its biodistribution and toxicity (Bell, 2005). Overall, one positive
conclusion is that an NMD inhibitor can be well tolerated by patients and
can be safe.

More recently, a new set of NMD inhibitors has been reported, based on a
screen to identify molecules capable of preventing the interaction between
UPF1 and SMG7 (Martin et al., 2014). For that, the authors screened
molecules in silico, based on their structure, in order to identify the
ones that could insert into a SMG7 pocket involved in the SMG7-UPF1
interaction. The strong advantage of this approach is that NMD inhibitors
are selected on the basis of a specific mode of action. Ten compounds have
been identified and validated using this virtual screen. Among these
compounds, NMDI 14 stabilizes PTC-containing globin mRNA up to sevenfold,
and is active at nanomolar concentrations with a low cellular toxicity,
making this molecule a very powerful NMD inhibitor.

4.3 Weaknesses 

NMD inhibition is a nonspecific approach, since all NMD reactions will be
affected. In particular, all natural substrates of NMD that are normally
degraded might be stabilized, with deleterious consequences if some of
these mRNAs encode toxic proteins. However, depending on the efficiency of
the NMD inhibition, natural substrates are not always stabilized, while
nonsense mutation-containing mRNAs are. For example, ex vivo inhibition of
NMD by NMDI 1 or amlexanox did not lead to the upregulation of genes using
NMD as a regulation pathway, unlike apoptosis activators that have been
recently shown to inhibit NMD (Durand et al., 2007; Gonzalez-Hilarion
et al., 2012; Jia et al., 2015; Popp and Maquat, 2015). It is likely the
efficiency of NMD inhibition that explains why some NMD inhibitors affect
the expression of natural substrates of NMD and why others do not. That
means that mRNAs harboring a nonsense mutation are more sensitive to a
variation in NMD efficiency than mRNAs using NMD for their regulation. A
hypothesis could be that the nucleotide environment of the PTC found on
natural substrates of NMD has been optimized during evolution to be
detected by the NMD machinery. This is not the case for the nucleotide
environment around a nonsense mutation. This means that the choice of NMD
inhibitor is decisive and, as for the readthrough approach, the most
efficient might not be the best choice for therapy.


5 PSEUDOURIDYLATION AT THE PTC
5.1 Principle 

Pseudouridylation is a natural modification occurring on a uridine base in
order to convert it into a pseudouridine (named 5-ribosyluracil, Ψ)
(Fig. 3.10). Different classes of RNAs are subject to pseudouridylation,
such as tRNAs, ribosomal RNAs (rRNAs), and small nuclear RNAs (snRNAs).
Pseudouridine has different biochemical properties, compared to uridine;
in particular, the number of hydrogen bonds that can be established with a
complementary nucleotide is different.

Figure 3.10 Structural representation of uridine and pseudouridine (Ψ).
Arrows symbolize the acceptor (a) or donor (d) hydrogen bonds established
with a complementary nucleotide.

Pseudouridylation is carried out by an H/ACA RNP consisting of a dimer of
four proteins, named Nhp2p, Nop10p, Gar1p, and Cbf5p in yeast, and an RNA
guide harboring a box H and a box ACA (Fig. 3.11). Interestingly, by
modifying the sequence of the RNA guide, it is possible to choose the
uridine to be modified (Huang et al., 2011, 2012). Since all stop codons
start with a uridine, they can potentially be subject to
pseudouridylation. The conversion of a premature termination codon into a
sense codon using pseudouridylation of the first nucleotide of the stop
codon has been successfully attempted in vitro and in vivo, at least in
yeast (Karijolich and Yu, 2011). The amino acids introduced at the
modified stop codon were predominantly serine or threonine at the UAA→ΨAA
or the UAG→ΨAG targeted stop codon, and tyrosine or phenylalanine at the
UGA→ΨGA targeted stop codon. This approach is very interesting because, by
artificially modifying a specific stop codon, it is possible to take
advantage of an existing mechanism that will now see a stop codon as a
sense codon.
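The decoding outcomes reported above for pseudouridylated stop codons in yeast (Karijolich and Yu, 2011) can be written as a small lookup. This is only an illustrative sketch: the function name and data layout are hypothetical, and the table lists the predominant residues quoted in the text without any frequencies.

```python
PSI = "\u03a8"  # Greek capital letter psi, used here to denote pseudouridine

# Predominant residues inserted at each modified stop codon, as quoted in
# the text for yeast; no relative frequencies are implied.
PSEUDOURIDYLATED_DECODING = {
    PSI + "AA": ("Ser", "Thr"),
    PSI + "AG": ("Ser", "Thr"),
    PSI + "GA": ("Tyr", "Phe"),
}


def pseudouridylate_stop(stop_codon):
    """Replace the first U of a stop codon by pseudouridine.

    Returns the modified codon and the residues reported at that codon.
    """
    rna = stop_codon.upper().replace("T", "U")
    if rna not in {"UAA", "UAG", "UGA"}:
        raise ValueError(f"{stop_codon!r} is not a stop codon")
    modified = PSI + rna[1:]
    return modified, PSEUDOURIDYLATED_DECODING[modified]
```

The small size of this table reflects the point made in the text: only a few residues are expected at a pseudouridylated PTC.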

Figure 3.11 Ribonucleoprotein complex in charge of the pseudouridylation.


5.2 Weaknesses 

As for any targeted method based on the recognition of a specific sequence
by annealing, there is always a possibility of annealing at a nonspecific
site. Another limitation in the development of pseudouridylation of a PTC
as a therapeutic approach is the requirement to introduce into cells a
gene expressing the RNA guide, or to deliver an RNA molecule to the cells.
The development of this approach is, therefore, related to the development
of gene therapy (see Section 6). Finally, the small number of amino acids
that can be inserted at the PTC position might reduce the chance of
getting functional proteins. However, this limited number of amino acids
generates a more homogeneous population of readthrough proteins than
chemical readthrough molecules do (Section 3). Pseudouridylation at the
PTC is definitely a very attractive approach to inducing readthrough of
PTCs, once the gene delivery barrier is overcome.


6 GENE THERAPY
6.1 Principle 

Gene therapy aims to replace a mutant gene, or a fraction of the gene, by
a wild-type version. This approach started to be explored in the 1980s,
using viruses such as adenovirus to carry and to introduce the
trans-/cis-gene (a trans-gene is a gene coming from another species, while
a cis-gene is a gene coming from the same species) into human cells
(Rosenberg et al., 1990). This approach can apply to any gene mutation
and, among them, to nonsense mutations, even though it has never been
developed for this specific purpose. Since 1989, more than 2000 clinical
trials have been developed for different diseases, and mainly for
cancer-related pathologies (64%).

Since gene therapy occurs at the DNA level, it raises the question of
doing it at the somatic level or at the germline level (Freire et al.,
2014). Ethical considerations have to be clearly resolved when it comes to
modifying the genome of the next generations of humans but, on the other
hand, gene therapy represents a definitive answer for genetic pathologies.
Once the gene has been replaced, no more treatment is required.

The main challenge of this approach is the delivery of the wild-type gene
version into the cells of a host organism. Recombinant viruses have been
developed for this purpose; adenovirus, retrovirus, and vaccinia virus are
among the most used (Niederer and Bangham, 2014). Using viruses takes
advantage of their systems for delivering a trans-gene into the nucleus of a
host cell. Some viruses, such as adenovirus, are episomal and do not
integrate their genome into the host DNA, while others, like retrovirus, do
integrate it. The consequences can be very different: the trans-gene brought
by adenovirus is expected to be diluted over successive cell divisions, while
the trans-gene brought by retrovirus should be kept in cells and transmitted
to daughter cells. However, the integration of the virus genome is not
stable: it can move from one position to another in the host genome, possibly
taking some surrounding DNA along at each transfer, which raises the risk of
inhibiting its expression. Similarly, at each integration of the virus
genome, the expression of genes in the vicinity might be disturbed, since the
virus genome carries transcription regulatory elements that could abnormally
increase the expression of some host genes or, in contrast, the integration
of the virus genome could destroy a transcriptional element of the host
genes.

This therapeutic approach could be used to treat specific tissues, owing to
the selectivity of viruses. Indeed, by using specific viruses, it is possible
to target some cell types preferentially. For example, adenoviruses have been
used to target liver cells (Connelly, 1999), while herpes simplex virus (HSV)
primarily infects neurons (Jacobs et al., 1999). This is definitely a huge
advantage of the approach, since the targeting can be very specific and the
efficiency of introducing a foreign DNA molecule into human cells is very
high.

6.2 Weaknesses 

Although this approach is very attractive, because it could treat any type
of mutation occurring in a gene, several limitations have been highlighted.
The use of a virus to introduce the trans-/cis-gene, and its integration in
the genome of the host cells, raise questions about the consequences of this
integration and about the stability of the exogenous nucleic acid sequence at
the integration point. The integration of the genomic sequence of the virus
could interfere with the expression of one or several genes by inserting into
gene sequences or by interrupting regulatory elements, such as promoter
sequences. The risk associated with the instability of the integration is
that part of the host genomic sequence is taken away with the genome of the
virus. Such an event could be at the origin of tumorigenesis, for example,
but not only that; it could also be at the origin of another genetic
disorder, by interfering with the expression of a gene related to a
pathology. The next challenge will be to target the integration of the
trans-/cis-gene to a specific neutral locus. Some studies already focus on
targeting the exogene to a specific position in the host genome, using
adenovirus- or retrovirus-derived vectors (Niederer and Bangham, 2014).
Although such approaches still struggle to achieve exclusive integration at
the predicted site in the genome, once that issue is solved they might become
the new way to treat any genetic disease, because once the gene is
introduced, no further treatment will be necessary, thus decreasing the risk
of side effects linked to long-term treatment.

7 CELL THERAPY 
7.1 Principle 

The injection of cells into an organism in order to correct a gene
expression or to bring a new function is called cell therapy (Fig. 3.12).
With the emergence of stem cell and induced pluripotent stem (IPS) cell
research, cell therapy has become a very promising therapeutic axis. In the
case of nonsense mutation correction, cell therapy would consist of injecting
cells carrying the wild-type version of the mutated gene into the patient.
Such an approach has not yet been evaluated for the correction of nonsense
mutations specifically, but it is suitable in theory. Thanks to stem cell
studies, it is possible to introduce wild-type progenitor cells from several
cell lineages that, under external stimuli, will be able to differentiate
into one specific lineage while maintaining the undifferentiated stock of
cells. The consequence is a mosaic expression of the targeted gene, with
mutant expression from the original patient cells and wild-type expression
from cells derived from the injected stem cells. The earlier the cell
injection takes place, the more wild-type cells will be present in the adult
organism.

Figure 3.12 Principle of cell therapy. Cells are collected from the patient or from a
healthy compatible donor and then cultured in vitro to generate IPS cells or isolate
stem cells. At that step, gene therapy or genome editing could also be applied to
correct a gene mutation. Finally, the cells can be re-implanted into the patient, where
they will be at the origin of a wild-type cell lineage.

Cell therapy has always raised ethical questions, in particular when it
comes to acting at the embryogenesis stage in order to prevent a pathology in
a future newborn from affected or carrier parents. However, for pathologies
affecting exclusively one cell type, such as leukemia or Crohn's disease,
cell therapy represents an exceptional therapeutic hope. IPS cells fit such
applications particularly well, since they are pluripotent and not
totipotent, meaning they can differentiate into several cell lineages, but
not all. In addition, their culture is now well understood, making their use
more accessible.

One of the current strategies is to take stem cells from the patient,
correct the mutation in these cells by gene therapy or genome editing, and
inject them back into the patient (see Sections 6 and 8). These cells will
not induce an immune response, since they come from the patients themselves,
and they will provide the wild-type version of the gene to all their daughter
cells. The fact that correcting the mutation in stem cells or IPS cells
relies on gene therapy or genome editing illustrates the need to combine
therapeutic approaches in order to improve future treatments.

Several types of stem cells exist and can be isolated thanks to specific
markers that depend on the cell lineages the stem cell is able to generate.
For instance, CD133 is a surface protein expressed in neurons, and can be
used as a neural stem cell marker (Sanai et al., 2005). By selecting cells
according to a set of markers found in the different cell types derived from
a common stem cell, it is possible to isolate these stem cells. However, stem
cells are very rare in a tissue, representing at most about 1 cell in
100,000. It is therefore a big challenge to isolate stem cells from a
patient, culture them to expand the population, and then correct the mutant
gene before injecting the stem cells back into the patient.

To solve the issue of the low stem cell population, induced pluripotent stem
cells (iPSC) have been developed. To obtain these cells, fibroblasts from a
patient, for example, are dedifferentiated so that they lose as many of their
differentiation markers as possible. These cells, called iPSC, are then
capable of replicating and/or entering a new differentiation pathway,
depending on the components of the cell culture medium and, in particular,
its content in growth hormones (Takahashi et al., 2007; Takahashi and
Yamanaka, 2006). To induce the loss of differentiation, cells are transfected
or infected in order to express at least four specific transcription factors,
such as C-MYC, SOX2, KLF4, and OCT3/4 (Takahashi et al., 2007), or other
combinations of genes, such as OCT4, SOX2, NANOG, and the translation
activator Lin28 (Yu et al., 2007). The ultimate goal would be to collect
fibroblasts from a skin biopsy and dedifferentiate them in order to then
differentiate them into neurons, muscle cells, or liver cells, for example.

7.2 Weaknesses 

Stem cells have been found in all tissues, indicating that any cell type can
be produced from these progenitor cells. The idea of correcting the mutation
in these cells before reinjecting them into the patient, and letting
differentiation occur under the control of the patient's body, is very
attractive. The main risk of this strategy arises at the step where the
mutation is corrected in the progenitor cells. For that step, the risks are
the same as for gene therapy or genome editing (Sections 6 and 8) and are
related to the modification of the genomic DNA of these cells. Since cell
therapy can modify the germline, ethical considerations have to be resolved
before going further in the development of this strategy. The putative risk
of tumorigenesis and the ethical concerns explain why cell therapy has only
just entered clinical trials. However, cell therapy is becoming safer step by
step with, for example, the use of alternative vectors to introduce the
reprogramming factors. Indeed, the retrovirus or lentivirus vectors
originally used in cell therapy are progressively being replaced by
adenovirus vectors, which limit integration into the host genome (Stadtfeld
et al., 2008), or by the Sendai virus, an RNA virus that replicates in the
cytoplasm, which excludes any integration into the host genome (Fusaki et
al., 2009).


8 GENOME EDITING 

Tools to modify a targeted DNA sequence in a genome have recently emerged,
and their capacities are promising for the editing of genomes ex vivo. With
the development of these tools, we enter the age of molecular surgery.
Several categories of editing systems with a similar mode of action have been
identified. This mode of action always starts with the cleavage of a specific
DNA molecule, and then involves the DNA repair machinery to fix the DNA
break. The DNA repair mechanisms that are activated are either homologous
recombination or nonhomologous end joining (NHEJ) (Fig. 3.13). The first
mechanism is very useful when an exogenous sequence needs to be introduced at
the cleavage site, while the second is of interest for inactivating a gene by
inducing deletions and frameshift mutations.

Mammalian cells cannot easily repair a double strand break. When a double
strand break occurs, cells have two methods to solve this issue. The first
one, called homologous recombination, requires a template that will be
integrated at the double strand break position. Homologous recombination
allows an exchange of DNA between two DNA molecules when the exchanged region
is flanked by identical sequences at each extremity (Fig. 3.13). Repairing
the double strand break by homologous recombination raises the risk of
inserting a DNA sequence into the gap. This risk becomes a huge advantage
when the method is used for genome editing, since a gene or a part of a gene
can be introduced at a specific position. This DNA sequence can be a new
protein domain, such as a tag, or the replacement of a missing gene part by
the wild-type sequence.

Figure 3.13 Schematic representation of the homologous recombination repair (HRR)
(left panel) and the NHEJ repair (right panel). HRR involves the replacement of the DNA
break area by a donor sequence identical to the region surrounding the DNA break. HRR
makes it possible to repair the broken DNA molecule identically to the original one, or
to introduce an exogenous sequence at the break position. NHEJ first involves DNases
that degrade DNA from the extremities generated by the DNA break. These DNases have
different speeds of degradation, generating single strand extremities. Once a partially
or totally complementary sequence is found on these single strand extremities, they
anneal to generate a DNA molecule with a single strand DNA break on each strand, but
not at the same position. Such breaks are easily repaired by base excision repair,
nucleotide excision repair, or mismatch repair mechanisms.

The second method, called NHEJ, illustrates how difficult it is for the cell
to resolve a double strand break. Basically, exonucleases are recruited to
the double strand break position to degrade the DNA ends at different speeds
until they generate cohesive or partially cohesive ends, transforming the
double strand break into two single strand breaks that the cell can repair.
The consequence is a deletion whose size depends on the sequence (Fig. 3.13).
This method is favored when the aim is to generate deletions in a gene by the
editing approach.
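
Why a deletion left by NHEJ tends to inactivate a gene can be illustrated with a short Python sketch. It is only a toy model: the 27-nucleotide ORF and the helper names (`translate`, `delete`, `is_frameshift`) are assumptions made for this illustration, and real NHEJ deletions vary in size and position. The sketch compares a 3-nucleotide deletion, which keeps the reading frame, with a 1-nucleotide deletion, which shifts it and here exposes a premature stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete(dna, start, length):
    """Remove `length` nucleotides starting at 0-based position `start`."""
    return dna[:start] + dna[start + length:]


def is_frameshift(deletion_length):
    """A deletion shifts the frame unless its length is a multiple of 3."""
    return deletion_length % 3 != 0


# Hypothetical toy ORF: ATG GCT GAA CTG AAA CGT TTC GGT TAA
orf = "ATGGCTGAACTGAAACGTTTCGGTTAA"

print(translate(orf))                 # MAELKRFG (full-length toy protein)
print(translate(delete(orf, 6, 3)))   # MALKRFG  (in-frame: one residue lost)
print(translate(delete(orf, 6, 1)))   # MAN      (frameshift, premature stop)
```

In this toy case the in-frame deletion removes a single amino acid, whereas the frameshifting deletion changes every downstream codon and truncates the product, which is the behavior exploited when NHEJ is used to inactivate a gene.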

Three editing methods have recently emerged and, even though they are not
yet included in clinical trials, they open new therapeutic strategies by
making it possible to modify the genome of a cell with high precision
(Fig. 3.14).




Figure 3.14 Principle of the three main genome-editing methods (zinc finger
nucleases, TALEN, and the CRISPR/Cas9 method).

8.1 Zinc Finger Nucleases

Zinc finger nucleases (ZFNs) have been reported since the beginning of this
century as a tool to induce a double strand break in DNA at a specific site
chosen by the experimenter (Moore et al., 2001). Basically, this tool makes
it possible to target a fusion protein to any DNA sequence by changing the
peptide sequence of its DNA binding domain (Mani et al., 2005). ZFNs consist
of a fusion between a nonspecific endonuclease activity, from the C-terminal
domain of the restriction enzyme FokI, and a DNA binding domain composed of
at least six zinc fingers capable of recognizing 18 bp (Fig. 3.14). The
identity of the amino acids in the zinc fingers dictates the DNA sequence
that is recognized: by changing the amino acid sequence, it is possible to
determine the DNA binding sequence and the cutting site. The probability of
finding a specific 18 bp sequence at random is 1/4^18, or about 1 in
6.9 × 10^10, meaning that it would take about 22 human genomes to find a
given sequence once by chance.
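
The estimate above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the four bases are treated as equally likely and independent, only one strand is considered, and the haploid human genome size is taken as roughly 3.1 × 10^9 bp (a value assumed here, not given in the text). The function names are illustrative.

```python
# Assumption: approximate haploid human genome size in base pairs.
GENOME_SIZE_BP = 3.1e9


def random_match_probability(length_bp):
    """Probability that a random site matches one specific sequence,
    assuming four equiprobable, independent bases."""
    return 1 / 4 ** length_bp


def genomes_per_expected_hit(length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Number of genome equivalents needed to expect one random match."""
    return 4 ** length_bp / genome_size_bp


length_bp = 18  # site recognized by six zinc fingers, 3 bp each
print(f"4^{length_bp} = {4 ** length_bp:.2e} possible sequences")
print(f"probability of a random match: {random_match_probability(length_bp):.2e}")
print(f"genome equivalents per expected hit: "
      f"{genomes_per_expected_hit(length_bp):.1f}")
```

With these assumptions the calculation gives about 6.9 × 10^10 possible 18 bp sequences and about 22 genome equivalents per expected random match. Real genomes are not random sequences, so this is an order-of-magnitude argument and not a guarantee of uniqueness.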

8.1.1 Weaknesses 

Several weak points are associated with ZFN technology. The first one is
common to all targeting approaches and relates to specificity, since it
cannot be excluded that ZFNs cut DNA at off-target sites. The second weak
point is shared with gene therapy, since it concerns the method of
introducing the ZFN, or the gene encoding the ZFN, into the patient cells.
Finally, the use of ZFNs in patient cells involves the expression of a new
protein that can induce an immune response.

8.2 Transcription Activator-Like Effector Nucleases 

The transcription activator-like effector nucleases (TALEN) approach is based
on the same principle as ZFN: a fusion protein is expressed that generally
combines the endonuclease FokI, to generate the double strand break, with a
DNA binding domain derived from transcription activator-like (TAL) proteins
rather than zinc finger domains. These proteins, isolated from the plant
pathogen Xanthomonas (Bai et al., 2000; Kay and Bonas, 2009; White and Yang,
2009; Yang and White, 2004), bind DNA thanks to a central domain consisting
of tandem repeats of about 34 amino acids. A correlation has been made
between the amino acid sequence of the repeats and the nucleotide sequence
bound by the TAL protein (Boch et al., 2009; Moscou and Bogdanove, 2009). In
particular, the amino acids at positions 12 and 13 of each repeat are
hypervariable and are responsible for the nucleotide recognition specificity.
Increasing or decreasing the number of repeats in a TALEN enzyme affects the
specificity of recognition of the target DNA sequence (Fig. 3.14).

8.2.1 Weaknesses

The same weak points as for ZFN apply to TALEN, since the engineering is
similar and the principle of DNA sequence recognition is also similar. TALEN
have not been included in clinical trials but, like ZFN, they represent a
very useful and precise tool to modify the genome of cells. Some groups have
reported the use of lentivirus to deliver a TALEN construct in order to
perform genome editing in vitro as well as in vivo, confirming the interest
in this technology (Mock et al., 2014).

8.3 CRISPR/Cas9 

The CRISPR/Cas9 system is the most recently developed genome editing system
(Cong et al., 2013; Jinek et al., 2012; Mali et al., 2013). The principle
differs from the first two tools in that the DNA or RNA sequence is
recognized via a guide RNA. The guide contains a sequence of about 20
nucleotides that anneals to the target sequence, and it tethers the nuclease
Cas9 because it also harbors a high-affinity binding sequence for this
nuclease. It is therefore very simple to cut genomic DNA at a specific locus
in a host genome by designing a dedicated antisense sequence for the guide
RNA. Given the length of the targeted sequence, the probability of finding a
specific sequence of 20 nucleotides at random is about 1/10^12, meaning that
such a sequence is present once in the equivalent of about 300 human genomes.
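
The annealing principle described above, in which the guide is antisense to the strand it binds, can be sketched in Python. This is only an illustration: the 20-nucleotide target below is a made-up sequence, the function names are hypothetical, and practical guide design involves further constraints that are not covered in this section.

```python
# DNA base on the bound strand -> RNA base that pairs with it.
DNA_TO_PAIRING_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}

# Allowed (RNA base, DNA base) Watson-Crick pairs.
RNA_DNA_PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def guide_for_target(target_dna):
    """Return the RNA sequence (5'->3') that is antisense to
    `target_dna` (5'->3'), i.e. its reverse complement written in RNA."""
    return "".join(DNA_TO_PAIRING_RNA[base] for base in reversed(target_dna.upper()))


def anneals(guide_rna, target_dna):
    """True if the guide pairs with the target at every position
    (strands are antiparallel, so the target is read backwards)."""
    if len(guide_rna) != len(target_dna):
        return False
    return all(
        (g, t) in RNA_DNA_PAIRS
        for g, t in zip(guide_rna, reversed(target_dna.upper()))
    )


# Hypothetical 20-nucleotide stretch of the strand to be bound.
target = "GACCTGAAGTTCATCTGCAC"
guide = guide_for_target(target)

print(guide)                    # GUGCAGAUGAACUUCAGGUC
print(anneals(guide, target))   # True
```

Changing a single base of the target makes `anneals` return False for the same guide, which is the basis of the sequence specificity, although in cells some mismatches are tolerated (see Section 8.3.2 on off-targets).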

The attractive point of this approach is the simplicity of designing the
targeting domain of the enzyme. The guide RNA and Cas9 can be encoded by two
genes carried on two different plasmids or on the same one. As for the two
preceding genome-editing methods, the enzyme is synthesized by the host cell,
meaning that editing the genome with the CRISPR/Cas9 system only requires a
basic transfection of cells.

Like the FokI nuclease, Cas9 cuts both strands of the DNA in a
sequence-nonspecific manner. CRISPR/Cas9 then activates homologous
recombination or the NHEJ repair system (Fig. 3.14). It is therefore possible
to replace a gene or a part of a gene via the CRISPR/Cas9 system, in vitro at
least. It is a very attractive strategy because the cost is low and the
design of the enzyme is accessible to any lab, which explains the increasing
number of reports on this method in less than 3 years. Some systems already
exist to deliver DNA molecules to specific cells (see Section 6), so they can
be applied to the CRISPR/Cas9 approach. Some of them have already been
tested; for example, an adenovirus expressing CRISPR/Cas9 has been used
successfully in vivo (Cheng et al., 2014). Another way to use the CRISPR/Cas9
system in therapeutic approaches is to correct a mutation in vitro in patient
cells and inject the corrected cells back into the patient (see Section 7 and
the following sections).

8.3.1 Illustration 

Although modifying the genome of an entire complex organism is extremely
challenging, CRISPR/Cas9 has already demonstrated its potential for
therapeutic development. Ousterout et al. (2015) succeeded in re-expressing a
dystrophin gene harboring frameshift mutations by inducing deletions via
CRISPR/Cas9 in order to restore the wild-type reading frame, generating an
internally truncated dystrophin protein. The authors restored the expression
in myoblasts, which they then injected into mice deficient for dystrophin
expression. They observed dystrophin protein in a few fibers in mice injected
with cells in which dystrophin expression had been restored, but not in mice
injected with cells in which it had not been restored, indicating that the
few fibers expressing dystrophin result from the use of CRISPR/Cas9 and are
not revertant cells.

8.3.2 Weaknesses 

The two major weak points of the CRISPR/Cas9 approach are off-target effects
and in vivo delivery, as is the case for the two other genome-editing
methods. Off-target cleavage has already been demonstrated (Cho et al., 2014;
Cradick et al., 2013; Fu et al., 2013; Hsu et al., 2013; Ousterout et al.,
2015; Pattanayak et al., 2013), so the challenge is to reduce the number of
off-targets as much as possible before exploring the therapeutic advantage of
this new technology. As for in vivo delivery, genome editing needs to be
combined with another therapeutic strategy, such as cell therapy.

9 COMBINATORY APPROACHES TO IMPROVE NONSENSE 
MUTATION THERAPIES 

Each of the strategies previously described has strengths and weaknesses
that can limit its use as a therapeutic approach. However, some of them can
be used together to obtain a new, stronger approach. Some of these putative
coupled approaches are described next; they represent only a few examples.

9.1 Activation of Both Transcription and Readthrough 

The low efficiency of PTC-readthrough is often attributed to the low
abundance of mRNA substrates for readthrough, since they are first checked
and degraded by NMD before they get a chance to undergo readthrough. One way
to overcome this issue is to attempt to saturate the NMD mechanism by
boosting transcription. This is likely what happens in cells expressing CFTR
mRNA harboring the W1282X nonsense mutation under the CMV promoter and
treated with sodium butyrate (Rowe et al., 2007). Sodium butyrate is known to
increase the transcription level of genes under the CMV promoter and, in this
study, the authors succeeded in stabilizing the PTC-containing CFTR mRNA very
efficiently. As a comparison, they reached about three times the level of
wild-type CFTR mRNA present in Calu-3 cells, which are already known to
overexpress the CFTR gene. Cotreatment with sodium butyrate and geneticin
induces a higher presence of CFTR at the cell membrane than sodium butyrate
alone, and a higher CFTR activity than sodium butyrate or geneticin alone.

9.2 Inhibition of NMD and Activation of Readthrough 

Another way to improve PTC-readthrough by increasing the amount of nonsense
mutation-containing mRNA is to inhibit NMD. The efficiency of this approach
has already been demonstrated ex vivo by combining an siRNA directed against
an NMD factor with a gentamicin treatment (Linde et al., 2007). In this
study, the authors showed that the efficiency of readthrough by gentamicin
was improved in the presence of a UPF1 siRNA or a UPF2 siRNA, which induce
NMD inhibition, but not in the presence of a nonspecific siRNA. A similar
demonstration has been made in vivo using the NMD inhibitor NMDI 1 (Durand et
al., 2007) and the readthrough molecule gentamicin in a mouse model harboring
the W392X nonsense mutation in α-L-iduronidase (Keeling et al., 2013).

Some molecules have the dual property of inhibiting NMD and promoting
readthrough. For example, geneticin (G418) is a well-known aminoglycoside
with a high readthrough efficiency (Bidou et al., 2004; Dranchak et al.,
2011; Sangkuhl et al., 2004). In addition, geneticin is also an inhibitor of
NMD, as has been demonstrated at least on DHCR7 mRNA harboring a Q98X or a
W151X nonsense mutation (Correa-Cerro et al., 2005). More recently,
amlexanox was selected in a screen as an NMD inhibitor, and it was shown not
only to inhibit NMD but also to promote PTC-readthrough (Gonzalez-Hilarion et
al., 2012). Amlexanox restores CFTR function in cells harboring a Q2X
nonsense mutation in the CFTR gene to a higher level than PTC124/ataluren,
for instance. It is therefore possible to find molecules with this dual
property that seem to be more efficient than molecules only capable of
readthrough; such molecules could represent a better solution than pure
readthrough molecules for the correction of nonsense mutations.

9.3 Gene Therapy/Genome Editing/Pseudouridylation at the
PTC and Cell Therapy

Gene therapy, genome editing, and pseudouridylation at the PTC share the
same limitation, namely the delivery of the nonsense mutation cure to all
cells, or to a specific cell type, of an organism. Rather than attempting to
deliver the cure directly, it could be more efficient to provide the
treatment to stem cells or IPS cells, and to inject these cured cells into
the final organism. With this approach, the correction of the nonsense
mutation or the PTC would be done in vitro, with all the advantages of in
vitro manipulation, such as control of the experimental parameters. Once the
stem cells or the IPS cells are treated, they can be injected into the final
organism to play their role of replacing differentiated cells and, in so
doing, to provide corrected expression of the mutant gene. There is still a
long way to go in studying the behavior of injected stem cells or IPS cells
in a complex organism, but there is no doubt that this approach will be
explored in the near future, especially because some clinics already offer
cell therapy treatments in some countries.

1 SUMMARY ON THE DIFFERENT STRATEGIES
AND THEIR RESULTS


A wide panel of new therapeutic strategies is being explored to offer
treatments for genetic diseases. Some of these strategies are very molecular
and target the DNA molecule (gene therapy, genome editing) or the RNA
molecule (trans-splicing, exon skipping, activation of PTC readthrough,
inhibition of NMD, pseudouridylation), while others are more cellular (cell
therapy). Many of them are still at the experimental development stage (NMD
inhibition, pseudouridylation, trans-splicing, genome editing) and some have
already reached the clinical phases (gene therapy, exon skipping, PTC
readthrough, or cell therapy).

They can all be applied to treat patients with a nonsense mutation, even
though only NMD inhibition and readthrough activation are exclusively
dedicated to nonsense mutation correction and could be called nonsense
mutation therapies. This means that many patients with different pathologies
and possibly different types of mutations could benefit from these
therapeutic approaches. It is important to have several options to treat a
mutation, because none of the described strategies is perfect; they all have
some weak points. However, some will be better suited than others to
correcting a specific nonsense mutation, depending, for example, on the
mutation and its position in the gene, the gene itself, its tissue-specific
expression, or its level of expression. In addition, developing several
therapeutic approaches with convergent applications in parallel increases the
chance of delivering a solution to treat nonsense mutations faster, since
strategies do not progress at the same speed, and some meet technical
limitations that will slow down their achievement. For example, the gene
delivery method might be a strong limitation for several of the previously
described strategies, such as gene therapy, pseudouridylation, genome
editing, or trans-splicing. Therapeutic approaches using chemical molecules
have fewer issues on those particular points and might be able to provide a
treatment earlier than other strategies that use more complex biological
tools.

The ethical question is also an issue that will slow down the development of
cell or gene therapy in particular. This question is really important and
involves not only scientists and clinicians, but also politicians and all
citizens, which explains why the use of embryonic stem cells is forbidden in
Germany and Italy; tightly controlled and restricted in the United States of
America or France; and much more accessible in other countries, such as
Austria, Poland, or Ireland, since no specific laws govern the use and the
origin of the stem cells there. Therefore, the development of cell therapy
will depend on the country.

Overall, results from the therapeutic strategies described in this book are
very encouraging, since the original objective of re-expressing the mutant
gene has been reached by several of them (trans-splicing, PTC-readthrough,
gene therapy, exon skipping, or cell therapy), at least in vitro and/or in
animal models. Solutions are also in development to improve these individual
strategies by evaluating combinations of two of them, with some already
positive results (NMD inhibition and PTC readthrough). Treatments are not yet
on the market but, with several molecules in clinical phase II or III, it
should not be long before the first treatment for nonsense mutation-related
diseases is released.


2 PERSONALIZED/TARGETED MEDICINE VERSUS 
TRADITIONAL MEDICINE 

Medicine increasingly treats the causes of pathology rather than the
symptoms. Personalized and targeted medication contribute to this modern way
of thinking in medicine. Without human genetic research programs to identify
the mutant genes responsible for a pathology, and without the human
genome-sequencing project, targeted and personalized medication would not be
possible today. It has been a long road to link genes to pathologies and to
understand the molecular mechanism behind the development of a disease. The
function of some genes responsible for a pathology when mutated is still not
fully understood, for instance in neurologic disorders or metabolic diseases,
for which the genetic link is often more recent than for rare diseases.

Thanks to the support of patient associations in particular, progress has
been made in various aspects of molecular, cellular, and medical biology. The
fastest and most significant progress is often related to the size of the
patient association behind a disease, which is also often related to the
number of patients affected by that pathology. However, associations of
patients with a rare disease have succeeded in encouraging part of the
scientific community to focus on rare genetic diseases. These pathologies are
attractive because it is possible to link one disorder to one gene, which is
simpler than for multifactorial pathologies. In theory, the treatment of rare
diseases comes down to bringing back the missing function encoded by the
mutant gene. The basic concept is to understand the function of the gene and
the molecular mechanism that is impaired by the mutation. Many different
approaches have been designed, as described in this book, focusing on
different aspects of the diseases or on the element to replace: the mutant
gene, the mRNA harboring the mutation, or the pathologic cell. Importantly,
all these therapeutic approaches can apply to any genetic disease and are not
restricted to one pathology. Progress made in the rare disease area might be
of interest for more frequent diseases such as cancer.

Thanks to the development of all these therapeutic approaches, patients
sharing the same pathology will soon no longer receive the same treatment,
because the aim will be to cure the cause of the pathology rather than its
consequences. In addition, patients with different pathologies might receive
the same treatment if the mutations at the origin of their pathologies belong
to the same category and can be treated in a similar way, molecularly
speaking. A treatment will not be dedicated to one pathology but to one type
of mutation, which is really a new way of thinking.

3 LIMITATIONS ON NONSENSE MUTATION THERAPIES 
AND FUTURE CONSIDERATIONS 

The therapies exclusively dedicated to nonsense mutation correction
(inhibition of NMD and activation of readthrough) target the mutant mRNA
and do not affect the patient genome. This is certainly a very attractive
ethical advantage, since the patient will not become a genetically modified
organism and the next generation will not be affected. However, such an
approach is also a potential problem after several generations. With these
approaches, the mutation in the gene is not corrected, since the treatment
acts only on the mutant mRNA, which keeps the mutation stable in a
population of apparently healthy people. The consequence is that nonsense
mutations will be stabilized in the human population and might
progressively spread through the human gene pool if treatment is provided
for several generations. Therefore, nonsense mutation therapies will have
to be considered as a temporary treatment limited to several generations,
until other therapies focusing on gene repair become available. Nonsense
mutation therapies nevertheless have to be developed because they are the
ones progressing the fastest, meaning that they will be available before
other options can be proposed to patients.

Another consideration to keep in mind with nonsense mutation therapies is
that both approaches rely on chemical molecules. The treatment will have
to be taken for the entire life of the patient. Besides the side effects
due to a very long period of exposure, such long-term treatment often
becomes less effective over time. The decrease in treatment efficacy is
related to an adaptation of the human body to the presence of the drug,
through mechanisms that block the entry of the molecule into cells, export
it from cells, or metabolize it into an inactive molecule. These mechanisms
are called drug resistance mechanisms and are well-known phenomena of
long-term treatments, in particular anticancer and antiviral treatments
(Clavel and Hance, 2004). This means that several molecules with similar
efficacy for nonsense mutation correction, but with different modes of
action, have to be available so that the drug can be changed when it
becomes less effective. This issue has already been anticipated since,
besides ataluren, several other molecules are being studied (amlexanox,
RTC13, and RTC14). This is also why the mode of action of each molecule
has to be determined, in order to ensure that the target protein is
different for each drug.

Although some treatments to correct nonsense mutations have reached the
final step before market release, the identification and characterization
of new molecules and treatments are still needed. To that end, new
technologies, new approaches, and new targets have to be developed and
studied, in order to propose a wide panel of solutions adapted to
each patient.
GLOSSARY


De novo mutation New mutation present in a family member and absent in the parents. Such
a mutation appears either in one of the germ cells of the parents or in the fertilized egg at the
origin of the family member carrying the de novo mutation.

Frameshift mutation Mutation inducing a change of the reading frame. This shift can be
towards the 3' end of the mRNA, in which case it is a positive frameshift. When the shift
is towards the 5' end of the mRNA, it is a negative frameshift. The frameshift can be minus
or plus 1, indicating that it corresponds to an insertion or a deletion of a multiple of three
plus 1 nucleotide. A frameshift plus or minus 2 corresponds to an insertion or a deletion of
a multiple of three plus 2 nucleotides.

Homologous Two genes are homologous when they share an ancestral gene at the origin 
of their evolution. 

Missense mutation Mutation leading to the replacement of one amino acid by another.

Nonsense mutation Mutation changing a translatable codon into an untranslatable codon 
called nonsense codon or stop codon, interrupting the open reading frame. 

Orthologous Two genes are orthologous when they are homologous and present in two
different organisms. For example, the UPF1 gene in yeast is an ortholog of the UPF1 gene in
humans.

Paralogous Two genes are paralogous when they are homologous and present in the same
organism. For example, in mammalian cells, the UPF3/UPF3a gene is a paralog of
UPF3X/UPF3b.

Personalized therapy Treatment designed for a unique patient according to the molecular 
origin of the pathology and his/her genetic and biologic backgrounds. 

Targeted therapy Treatment designed for some but not all patients, whether or not they
are affected by the same pathology. Targeted therapies are designed according to the
molecular mechanism and/or the type of mutation promoting the pathology. Patients who
benefit from a targeted therapy share the same molecular causes at the origin of the pathology.

Transition A transition is a mutation changing a purine base (guanine or adenine)
into the other purine base, or a pyrimidine base (cytosine or thymine) into the other
pyrimidine base.

Transversion A transversion is a mutation changing a purine base (guanine or adenine)
into a pyrimidine base (cytosine or thymine), or a pyrimidine base into a purine base.
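
Several of the mutation categories defined in this glossary reduce to simple rules on bases, codons, and indel lengths. The following Python sketch illustrates three of them (transition versus transversion, frameshift size, and nonsense mutation). The function names are hypothetical and introduced here only for illustration; the stop codons are the three standard ones (UAA, UAG, UGA), and DNA input is assumed to be given on the coding strand so that T can be read as U.

```python
# Illustrative sketch only: names below are assumptions, not from the book.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def _to_rna(seq: str) -> str:
    """Upper-case a sequence and read T as U (coding-strand convention)."""
    return seq.upper().replace("T", "U")


def substitution_type(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = _to_rna(ref), _to_rna(alt)
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError("expected two different bases among A, C, G, T/U")
    # Same chemical class (purine/purine or pyrimidine/pyrimidine): transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def frameshift_size(indel_length: int) -> int:
    """Return 0 for an in-frame indel, or 1 or 2 for a frameshift.

    indel_length is the number of inserted or deleted nucleotides;
    only its remainder modulo three matters for the reading frame.
    """
    return abs(indel_length) % 3


def is_nonsense(ref_codon: str, alt_codon: str) -> bool:
    """True when a translatable codon is changed into a stop codon."""
    ref, alt = _to_rna(ref_codon), _to_rna(alt_codon)
    return ref not in STOP_CODONS and alt in STOP_CODONS
```

For example, `substitution_type("G", "A")` is a transition, `substitution_type("C", "A")` is a transversion, `frameshift_size(4)` gives 1 (a multiple of three plus 1 nucleotide), and `is_nonsense("CAG", "TAG")` is true because a glutamine codon becomes a stop codon. The sketch deliberately does not assign a plus or minus sign to the frameshift.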